Author name: Mike M.


SCOTUS kills Chevron deference, giving courts more power to block federal rules

Supreme Court Chief Justice John Roberts and Associate Justice Sonia Sotomayor arrive for President Joe Biden’s State of the Union address on March 7, 2024, in Washington, DC.

Getty Images | Win McNamee

The US Supreme Court today overturned the 40-year-old Chevron precedent in a ruling that limits the regulatory authority of federal agencies. The 6-3 decision in Loper Bright Enterprises v. Raimondo will make it harder for agencies such as the Federal Communications Commission and Environmental Protection Agency to issue regulations without explicit authorization from Congress.

Chief Justice John Roberts delivered the opinion of the court and was joined by Clarence Thomas, Samuel Alito, Neil Gorsuch, Brett Kavanaugh, and Amy Coney Barrett. Justice Elena Kagan filed a dissenting opinion that was joined by Sonia Sotomayor and Ketanji Brown Jackson.

Chevron gave agencies leeway to interpret ambiguous laws as long as the agency’s conclusion was reasonable. But the Roberts court said that a “statutory ambiguity does not necessarily reflect a congressional intent that an agency, as opposed to a court, resolve the resulting interpretive question.”

“Perhaps most fundamentally, Chevron’s presumption is misguided because agencies have no special competence in resolving statutory ambiguities. Courts do,” the ruling said. “The Framers anticipated that courts would often confront statutory ambiguities and expected that courts would resolve them by exercising independent legal judgment. Chevron gravely erred in concluding that the inquiry is fundamentally different just because an administrative interpretation is in play.”

This is especially critical “when the ambiguity is about the scope of an agency’s own power—perhaps the occasion on which abdication in favor of the agency is least appropriate,” the court said. The Roberts opinion also said the Administrative Procedure Act “specifies that courts, not agencies, will decide ‘all relevant questions of law’ arising on review of agency action—even those involving ambiguous laws,” and “prescribes no deferential standard for courts to employ in answering those legal questions.”

Kagan: SCOTUS majority now “administrative czar”

The Loper Bright case involved a challenge to a rule enforced by the National Marine Fisheries Service. Lower courts applied the Chevron framework when ruling in favor of the government.

Kagan’s dissent said that Chevron “has become part of the warp and woof of modern government, supporting regulatory efforts of all kinds—to name a few, keeping air and water clean, food and drugs safe, and financial markets honest.”

Ambiguities should generally be resolved by agencies instead of courts, Kagan wrote. “This Court has long understood Chevron deference to reflect what Congress would want, and so to be rooted in a presumption of legislative intent. Congress knows that it does not—in fact cannot—write perfectly complete regulatory statutes. It knows that those statutes will inevitably contain ambiguities that some other actor will have to resolve, and gaps that some other actor will have to fill. And it would usually prefer that actor to be the responsible agency, not a court,” the dissent said.

The Roberts court ruling “flips the script: It is now ‘the courts (rather than the agency)’ that will wield power when Congress has left an area of interpretive discretion,” Kagan wrote. “A rule of judicial humility gives way to a rule of judicial hubris.”

Kagan wrote that the court in recent years “has too often taken for itself decision-making authority Congress assigned to agencies,” substituting “its own judgment on workplace health for that of the Occupational Safety and Health Administration; its own judgment on climate change for that of the Environmental Protection Agency; and its own judgment on student loans for that of the Department of Education.”

Apparently deciding those previous decisions were “too piecemeal,” the court “majority today gives itself exclusive power over every open issue—no matter how expertise-driven or policy-laden—involving the meaning of regulatory law,” Kagan wrote. “As if it did not have enough on its plate, the majority turns itself into the country’s administrative czar. It defends that move as one (suddenly) required by the (nearly 80-year-old) Administrative Procedure Act. But the Act makes no such demand. Today’s decision is not one Congress directed. It is entirely the majority’s choice.”

The unanimous 1984 SCOTUS ruling in Chevron U.S.A. Inc. v. Natural Resources Defense Council involved the Environmental Protection Agency and air pollution rules. Even with Chevron deference in place, the EPA faced limits to its regulatory power. A Supreme Court ruling earlier this week imposed a stay on rules meant to limit the spread of ozone-generating pollutants across state lines.

Consumer advocacy group Public Knowledge criticized today’s ruling, saying that it “grounds judicial superiority over the legislative and executive branches by declaring that the Constitution requires judges to unilaterally decide the meaning of statutes written by Congress and entrusted to agencies.”

Public Knowledge Senior VP Harold Feld argued that after today’s ruling, “no consumer protection is safe. Even if Congress can write with such specificity that a court cannot dispute its plain meaning, Congress will need to change the law for every new technology and every change in business practice. Even at the best of times, it would be impossible for Congress to keep up. Given the dysfunction of Congress today, we are at the mercy of the whims of the Imperial Court.”



The world’s toughest race starts Saturday, and it’s delightfully hard to call this year

Is it Saturday yet? —

Setting the stage for what could be a wild ride across France.

The peloton passing through a sunflower field during stage eight of the 110th Tour de France in 2023.

David Ramos/Getty Images

Most readers probably did not anticipate seeing a Tour de France preview on Ars Technica, but here we are. Cycling is a huge passion of mine and several other staffers, and this year, a ton of intrigue surrounds the race, which has a fantastic route. So we’re here to spread Tour fever.

The three-week race starts Saturday, paradoxically in the Italian region of Tuscany. Usually, there is a dominant rider, or at most two, and a clear sense of who is likely to win the demanding race. But this year, due to rider schedules, a terrible crash in early April, and new contenders, there is more uncertainty than usual. A solid case could be made for at least four riders to win this year’s Tour de France.

For people who aren’t fans of pro road cycling—which has to be at least 99 percent of the United States—there’s a great series on Netflix called Unchained to help get you up to speed. The second season, just released, covers last year’s Tour de France and introduces you to most of the protagonists in the forthcoming edition. If this article sparks your interest, I recommend checking it out.

Anyway, for those who are cycling curious, I want to set the stage for this year’s race by saying a little bit about the four main contenders, from most likely to least likely to win, and provide some of the backstory to what could very well be a dramatic race this year.

Tadej Pogačar

Tadej Pogacar of Slovenia and UAE Team Emirates won the Giro d’Italia in May.

Tim de Waele/Getty Images

  • Slovenia
  • 25 years old
  • UAE Team Emirates
  • Odds: -190

Pogačar burst onto the scene in 2019 at the very young age of 20 by finishing third in the Vuelta a España, one of the three grand tours of cycling. He then went on to win the 2020 and 2021 Tours de France, first by surprising fellow countryman Primož Roglič (more on him below) in 2020 and then utterly dominating in 2021. Given his youth, it seemed he would be the premier grand tour competitor for the next decade.

But then another slightly older rider, a teammate of Roglič’s named Jonas Vingegaard, emerged in 2022 and won the next two races. Last year, in fact, Vingegaard cracked Pogačar by 7 minutes and 29 seconds in the Tour, a huge winning margin, especially for two riders of relatively close talent. This established Vingegaard as the alpha male of grand tour cyclists, having proven himself a better climber and time trialist than Pogačar, especially in the highest and hardest stages.

So this year, Pogačar decided to change up his strategy. Instead of focusing on the Tour de France, Pogačar participated in the first grand tour of the season, the Giro d’Italia, which occurred in May. He likely did so for a couple of reasons. First of all, he almost certainly received a generous appearance fee from the Italian organizers. And secondly, riding the Giro would give him a ready excuse for not beating Vingegaard in France.

Why is this? Because there are just five weeks between the end of the Giro and the start of the Tour. So if a rider peaks for the Giro and exerts himself in winning the race, it is generally thought that he can’t arrive at the Tour in winning form. He will be a few percent off, not having ideal preparation.

Predictably, Pogačar smashed the lesser competition at the Giro and won the race by 9 minutes and 56 seconds. Because he was so far ahead, he was able to take the final week of the race a bit easier. The general thinking in the cycling community is that Pogačar is arriving at the Tour in excellent but not peak form. But given everything else that has happened so far this season, the bettors believe that will be enough for him to win. Maybe.



Monitoring and Analytics: The Eyes and Ears of Zero Trust

Welcome back to our zero trust blog series! In our previous post, we took a deep dive into API security and explored best practices for securing this critical component of modern application architectures. Today, we’re turning our attention to another essential aspect of zero trust: monitoring and analytics.

In a zero trust model, visibility is everything. With no implicit trust granted to any user, device, or application, organizations must continuously monitor and analyze all activity across their environment to detect and respond to potential threats in real time.

In this post, we’ll explore the role of monitoring and analytics in a zero trust model, discuss the key data sources and technologies involved, and share best practices for building a comprehensive monitoring and analytics strategy.

The Role of Monitoring and Analytics in Zero Trust

In a traditional perimeter-based security model, monitoring and analytics often focus on detecting threats at the network boundary. However, in a zero trust model, the perimeter is everywhere, and threats can come from any user, device, or application, both inside and outside the organization.

To mitigate these risks, zero trust requires organizations to take a comprehensive, data-driven approach to monitoring and analytics. This involves:

  1. Continuous monitoring: Collecting and analyzing data from all relevant sources, including users, devices, applications, and infrastructure, in real time.
  2. Behavioral analytics: Using machine learning and other advanced analytics techniques to identify anomalous or suspicious behavior that may indicate a potential threat.
  3. Automated response: Leveraging automation and orchestration tools to quickly investigate and remediate potential threats, minimizing the impact of security incidents.
  4. Continuous improvement: Using insights from monitoring and analytics to continuously refine and optimize security policies, controls, and processes.

By applying these principles, organizations can create a more proactive, adaptive security posture that can detect and respond to threats faster and more effectively than traditional approaches.
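To make the second and third principles a little more concrete, here is a minimal sketch of an automated response hook that maps an analytics finding to a graduated action. Everything in it (the risk thresholds, the actions, the field names) is a hypothetical illustration, not a reference to any particular product or API.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    entity: str        # user, device, or service principal being evaluated
    risk_score: float  # 0.0 (benign) to 1.0 (critical), produced by behavioral analytics

def respond(finding: Finding) -> str:
    """Map an analytics finding to a graduated, automated response."""
    if finding.risk_score >= 0.9:
        return f"revoke sessions and disable {finding.entity}; page the on-call analyst"
    if finding.risk_score >= 0.6:
        return f"require step-up MFA for {finding.entity}; open an investigation ticket"
    return f"log the finding for {finding.entity}; no action needed"

print(respond(Finding(entity="alice", risk_score=0.95)))
print(respond(Finding(entity="build-server-7", risk_score=0.40)))
```

The point is simply that access decisions are driven by continuously computed signals rather than by where a request happens to originate.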

Key Data Sources and Technologies for Zero Trust Monitoring and Analytics

To build a comprehensive monitoring and analytics strategy for zero trust, organizations must collect and analyze data from a wide range of sources, including:

  1. Identity and access management (IAM) systems: Data on user identities, roles, and permissions, as well as authentication and authorization events.
  2. Endpoint detection and response (EDR) tools: Data on device health, configuration, and activity, as well as potential threats and vulnerabilities.
  3. Network security tools: Data on network traffic, including flow logs, packet captures, and intrusion detection and prevention system (IDPS) events.
  4. Application performance monitoring (APM) tools: Data on application performance, errors, and potential security issues, such as injection attacks or data exfiltration attempts.
  5. Cloud security posture management (CSPM) tools: Data on cloud resource configurations, compliance with security policies, and potential misconfigurations or vulnerabilities.

To collect, process, and analyze this data, organizations can leverage a range of technologies, including:

  1. Security information and event management (SIEM) platforms: Centralized platforms for collecting, normalizing, and analyzing security event data from multiple sources.
  2. User and entity behavior analytics (UEBA) tools: Advanced analytics tools that use machine learning to identify anomalous or suspicious behavior by users, devices, and applications.
  3. Security orchestration, automation, and response (SOAR) platforms: Tools that automate and orchestrate security processes, such as incident response and remediation, based on predefined playbooks and workflows.
  4. Big data platforms: Scalable platforms for storing, processing, and analyzing large volumes of structured and unstructured security data, such as Hadoop, Spark, and Elasticsearch.

By leveraging these data sources and technologies, organizations can build a comprehensive, data-driven monitoring and analytics strategy that can detect and respond to threats in real time.
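As a rough illustration of that aggregation step, the sketch below shows how records from two different sources might be normalized into one common schema before analysis. The field names and schema are assumptions made up for this example, not the format of any specific SIEM.

```python
# Hypothetical common schema: every event is reduced to the same few fields
# so downstream analytics can treat IAM and EDR events uniformly.
COMMON_FIELDS = ("timestamp", "source", "principal", "action", "outcome")

def normalize_iam_event(raw: dict) -> dict:
    """Map a hypothetical IAM login record onto the common schema."""
    return {
        "timestamp": raw["eventTime"],            # assumed field name
        "source": "iam",
        "principal": raw["userName"],
        "action": raw["eventName"],               # e.g. "ConsoleLogin"
        "outcome": raw.get("responseStatus", "unknown"),
    }

def normalize_edr_event(raw: dict) -> dict:
    """Map a hypothetical EDR process alert onto the common schema."""
    return {
        "timestamp": raw["detected_at"],
        "source": "edr",
        "principal": raw["hostname"],
        "action": f"process:{raw['process_name']}",
        "outcome": raw.get("severity", "info"),
    }

# Example usage with made-up records:
events = [
    normalize_iam_event({"eventTime": "2024-06-28T12:00:00Z",
                         "userName": "alice",
                         "eventName": "ConsoleLogin",
                         "responseStatus": "Success"}),
    normalize_edr_event({"detected_at": "2024-06-28T12:01:30Z",
                         "hostname": "laptop-42",
                         "process_name": "curl",
                         "severity": "medium"}),
]

for event in events:
    assert set(event) == set(COMMON_FIELDS)
    print(event)
```

Once every source speaks the same schema, the correlation and behavioral analytics described above become much simpler to build.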

Best Practices for Zero Trust Monitoring and Analytics

Implementing a zero trust approach to monitoring and analytics requires a comprehensive, multi-layered strategy. Here are some best practices to consider:

  1. Identify and prioritize data sources: Identify all relevant data sources across your environment, and prioritize them based on their level of risk and criticality. Focus on collecting data from high-risk sources first, such as IAM systems, EDR tools, and critical applications.
  2. Establish a centralized logging and monitoring platform: Implement a centralized platform, such as a SIEM or big data platform, to collect, normalize, and analyze security event data from multiple sources. Ensure that the platform can scale to handle the volume and variety of data generated by a zero trust environment.
  3. Implement behavioral analytics: Leverage UEBA tools and machine learning algorithms to identify anomalous or suspicious behavior by users, devices, and applications. Focus on detecting behavior that deviates from established baselines or patterns, such as unusual login attempts, data access patterns, or network traffic (see the sketch after this list).
  4. Automate incident response and remediation: Implement SOAR tools and automated playbooks to quickly investigate and remediate potential threats. Ensure that playbooks are aligned with zero trust principles, such as least privilege access and continuous verification.
  5. Continuously monitor and refine policies and controls: Use insights from monitoring and analytics to continuously refine and optimize security policies, controls, and processes. Regularly review and update policies based on changes in the threat landscape, business requirements, and user behavior.
  6. Foster a culture of continuous improvement: Encourage a culture of continuous learning and improvement across the organization. Regularly share insights and lessons learned from monitoring and analytics with stakeholders, and use them to drive ongoing enhancements to the zero trust strategy.

By implementing these best practices and continuously refining your monitoring and analytics posture, you can better protect your organization’s assets and data from the risks posed by evolving threats and changing business requirements.

Conclusion

In a zero trust world, monitoring and analytics are the eyes and ears of the security organization. By continuously collecting and analyzing data from all relevant sources, organizations can detect and respond to potential threats faster and more effectively than ever before.

However, achieving effective monitoring and analytics in a zero trust model requires a commitment to leveraging the right data sources and technologies, implementing behavioral analytics and automation, and fostering a culture of continuous improvement. It also requires a shift in mindset, from a reactive, perimeter-based approach to a proactive, data-driven approach that assumes no implicit trust.

As you continue your zero trust journey, make monitoring and analytics a top priority. Invest in the tools, processes, and skills necessary to build a comprehensive monitoring and analytics strategy, and regularly assess and refine your approach to keep pace with evolving threats and business needs.

In the next post, we’ll explore the role of automation and orchestration in a zero trust model and share best practices for using these technologies to streamline security processes and accelerate incident response.

Until then, stay vigilant and keep your eyes and ears open!




Google Translate just nearly doubled its number of supported languages

Large language models —

This includes common languages like Cantonese and lesser-known ones like Manx.

The logo for PaLM 2, a Google large language model.

Google

Google announced today that it has added support for 110 new languages to Google Translate, nearly doubling the number of languages that can be translated.

The company used the PaLM 2 large language model to facilitate these additions.

In a blog post, Google Senior Software Engineer Isaac Caswell claimed that the newly added languages are spoken by more than 614 million people, or about 8 percent of the global population.

He noted that about a quarter of the languages originate in Africa, “representing our largest expansion of African languages to date.”

The blog post also went into some light detail about Google’s philosophy for choosing languages and for deciding which dialects to support:

Languages have an immense amount of variation: regional varieties, dialects, different spelling standards. In fact, many languages have no one standard form, so it’s impossible to pick a “right” variety. Our approach has been to prioritize the most commonly used varieties of each language. For example, Romani is a language that has many dialects all throughout Europe. Our models produce text that is closest to Southern Vlax Romani, a commonly used variety online. But it also mixes in elements from others, like Northern Vlax and Balkan Romani.

This update brings the total number of languages supported by Google Translate to 243, which is just the beginning of its publicized initiative to ultimately support 1,000 languages through the use of AI. You can see the full list of languages added in a help page published by Google.

By contrast, Apple Translate supports 21 languages, though that number includes both US and UK English as distinct options. Apple recently announced plans to add Hindi to its Translate app. Of course, Apple and Google take very different approaches to—and have different levels of investment in—these tools.



OpenAI’s new “CriticGPT” model is trained to criticize GPT-4 outputs

automated critic —

Research model catches bugs in AI-generated code, improving human oversight of AI.

An illustration created by OpenAI.

On Thursday, OpenAI researchers unveiled CriticGPT, a new AI model designed to identify mistakes in code generated by ChatGPT. It aims to enhance the process of making AI systems behave in ways humans want (called “alignment”) through Reinforcement Learning from Human Feedback (RLHF), which helps human reviewers make large language model (LLM) outputs more accurate.

As outlined in a new research paper called “LLM Critics Help Catch LLM Bugs,” OpenAI created CriticGPT to act as an AI assistant to human trainers who review programming code generated by the ChatGPT AI assistant. CriticGPT—based on the GPT-4 family of LLMs—analyzes the code and points out potential errors, making it easier for humans to spot mistakes that might otherwise go unnoticed. The researchers trained CriticGPT on a dataset of code samples with intentionally inserted bugs, teaching it to recognize and flag various coding errors.

The researchers found that CriticGPT’s critiques were preferred by annotators over human critiques in 63 percent of cases involving naturally occurring LLM errors and that human-machine teams using CriticGPT wrote more comprehensive critiques than humans alone while reducing confabulation (hallucination) rates compared to AI-only critiques.

Developing an automated critic

The development of CriticGPT involved training the model on a large number of inputs containing deliberately inserted mistakes. Human trainers were asked to modify code written by ChatGPT, introducing errors and then providing example feedback as if they had discovered these bugs. This process allowed the model to learn how to identify and critique various types of coding errors.
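To give a flavor of what one such tampered training example might look like, here is a purely illustrative sketch. The structure and field names are assumptions for the sake of the example, not the format used in OpenAI’s paper.

```python
# Illustrative only: a hypothetical record pairing tampered code with the
# trainer-written critique that a CriticGPT-style model learns to reproduce.
tampered_example = {
    "original_code": (
        "def average(values):\n"
        "    return sum(values) / len(values)\n"
    ),
    "tampered_code": (
        "def average(values):\n"
        "    return sum(values) / (len(values) - 1)\n"  # inserted off-by-one bug
    ),
    "trainer_critique": (
        "The function divides by len(values) - 1 instead of len(values), "
        "so it returns the wrong value and raises ZeroDivisionError for a "
        "single-element list."
    ),
}

# Training would then optimize the critic to produce critiques like
# `trainer_critique` when shown only the tampered code.
print(tampered_example["trainer_critique"])
```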

In experiments, CriticGPT demonstrated its ability to catch both inserted bugs and naturally occurring errors in ChatGPT’s output. The new model’s critiques were preferred by trainers over those generated by ChatGPT itself in 63 percent of cases involving natural bugs (the aforementioned statistic). This preference was partly due to CriticGPT producing fewer unhelpful “nitpicks” and generating fewer false positives, or hallucinated problems.

The researchers also created a new technique they call Force Sampling Beam Search (FSBS). This method helps CriticGPT write more detailed reviews of code. It lets the researchers adjust how thorough CriticGPT is in looking for problems, while also controlling how often it might make up issues that don’t really exist. They can tweak this balance depending on what they need for different AI training tasks.

Interestingly, the researchers found that CriticGPT’s capabilities extend beyond just code review. In their experiments, they applied the model to a subset of ChatGPT training data that had previously been rated as flawless by human annotators. Surprisingly, CriticGPT identified errors in 24 percent of these cases—errors that were subsequently confirmed by human reviewers. OpenAI thinks this demonstrates the model’s potential to generalize to non-code tasks and highlights its ability to catch subtle mistakes that even careful human evaluation might miss.

Despite its promising results, like all AI models, CriticGPT has limitations. The model was trained on relatively short ChatGPT answers, which may not fully prepare it for evaluating longer, more complex tasks that future AI systems might tackle. Additionally, while CriticGPT reduces confabulations, it doesn’t eliminate them entirely, and human trainers can still make labeling mistakes based on these false outputs.

The research team acknowledges that CriticGPT is most effective at identifying errors that can be pinpointed in one specific location within the code. However, real-world mistakes in AI outputs can often be spread across multiple parts of an answer, presenting a challenge for future iterations of the model.

OpenAI plans to integrate CriticGPT-like models into its RLHF labeling pipeline, providing its trainers with AI assistance. For OpenAI, it’s a step toward developing better tools for evaluating outputs from LLM systems that may be difficult for humans to rate without additional support. However, the researchers caution that even with tools like CriticGPT, extremely complex tasks or responses may still prove challenging for human evaluators—even those assisted by AI.



Mac users served info-stealer malware through Google ads

MOAR MALVERTISING —

Full-service Poseidon info stealer pushed by “advertiser identity verified by Google.”


Getty Images

Mac malware that steals passwords, cryptocurrency wallets, and other sensitive data has been spotted circulating through Google ads, making it at least the second time in as many months the widely used ad platform has been abused to infect web surfers.

The latest ads, found by security firm Malwarebytes on Monday, promote Mac versions of Arc, an unconventional browser that became generally available for the macOS platform last July. The listing promises users a “calmer, more personal” experience with less clutter and fewer distractions, a marketing message that mimics the one communicated by The Browser Company, the start-up maker of Arc.

When verified isn’t verified

According to Malwarebytes, clicking on the ads redirected Web surfers to arc-download[.]com, a completely fake Arc browser page that looks nearly identical to the real one.


Digging further into the ad shows that it was purchased by an entity called Coles & Co, an advertiser identity Google claims to have verified.


Visitors who click the download button on arc-download[.]com will download a .dmg installation file that looks similar to the genuine one, with one exception: instructions to run the file by right-clicking and choosing Open, rather than the more straightforward method of simply double-clicking on the file. The reason for this is to bypass a macOS security mechanism that prevents apps from being installed unless they’re digitally signed by a developer Apple has vetted.


An analysis of the malware code shows that once installed, the stealer sends data to the IP address 79.137.192[.]4. The address happens to host the control panel for Poseidon, the name of a stealer actively sold in criminal markets. The panel allows customers to access the accounts where the collected data is stored.


“There is an active scene for Mac malware development focused on stealers,” Jérôme Segura, lead malware intelligence analyst at Malwarebytes, wrote. “As we can see in this post, there are many contributing factors to such a criminal enterprise. The vendor needs to convince potential customers that their product is feature-rich and has low detection from antivirus software.”

Poseidon advertises itself as a full-service macOS stealer with capabilities including “file grabber, cryptocurrency wallet extractor, password stealer from managers such as Bitwarden, KeePassXC, and browser data collector.” Crime forum posts published by the stealer creator bill it as a competitor to Atomic Stealer, a similar stealer for macOS. Segura said both apps share much of the same underlying source code.

The post author, Rodrigo4, has added a new feature for looting VPN configurations, but it’s not currently functional, likely because it’s still in development. The forum post appeared on Sunday, and Malwarebytes found the malicious ads one day later. The discovery comes a month after Malwarebytes identified a separate batch of Google ads pushing a fake version of Arc for Windows. The installer in that campaign installed a suspected infostealer for that platform.


Like most other large advertising networks, Google Ads regularly serves malicious content that isn’t taken down until third parties have notified the company. Google Ads takes no responsibility for any damage that may result from the oversights. The company said in an email that it removes malicious ads once it learns of them and suspends the advertiser, and that it has done so in this case.

People who want to install software advertised online should seek out the official download site rather than relying on the site linked in the ad. They should also be wary of any instructions that direct Mac users to install apps through the double-click method mentioned earlier. The Malwarebytes post provides indicators of compromise people can use to determine if they’ve been targeted.



NASA will pay SpaceX nearly $1 billion to deorbit the International Space Station

Illustration of the SpaceX Dragon XL as it is deployed from the Falcon Heavy’s second stage in high Earth orbit on its way to the Gateway in lunar orbit.

SpaceX

NASA has awarded an $843 million contract to SpaceX to develop a “US Deorbit Vehicle.” This spacecraft will dock to the International Space Station in 2029 and then ensure the large facility makes a controlled reentry through Earth’s atmosphere before splashing into the ocean in 2030.

“Selecting a US Deorbit Vehicle for the International Space Station will help NASA and its international partners ensure a safe and responsible transition in low Earth orbit at the end of station operations,” said Ken Bowersox, NASA’s associate administrator for Space Operations, in a statement. “This decision also supports NASA’s plans for future commercial destinations and allows for the continued use of space near Earth.”

NASA has a couple of reasons for bringing the space station’s life to a close in 2030. Foremost among these is that the station is aging. Parts of it are now a quarter of a century old. There are cracks on the Russian segment of the space station that are spreading. Although the station could likely be maintained beyond 2030, it would require increasing amounts of crew time to keep flying the station safely.

Additionally, NASA is seeking to foster a commercial economy in low-Earth orbit. To that end, it is working with several private companies to develop commercial space stations that would be able to house NASA astronauts, as well as those from other countries and private citizens, by or before 2030. By setting an end date for the station’s lifetime and sticking with it, NASA can help those private companies raise money from investors.

Do we have to sink the station?

The station, the largest object humans have ever constructed in space, is too large to allow it to make an uncontrolled return to Earth. It has a mass of 450 metric tons and is about the size of an American football field. The threat to human life and property is too great. Hence the need for a deorbit vehicle.

The space agency considered alternatives to splashing the station down into a remote area of an ocean. One option involved moving the station into a stable parking orbit at 40,000 km above Earth, above geostationary orbit. However, the agency said this would require 3,900 m/s of delta-V, compared to the approximately 47 m/s of delta-V needed to deorbit the station. In terms of propellant, NASA estimated moving to a higher orbit would require 900 metric tons, or the equivalent of 150 to 250 cargo supply vehicles.
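Those propellant figures can be sanity-checked with the ideal (Tsiolkovsky) rocket equation. The sketch below is a rough back-of-the-envelope calculation that assumes a 450-metric-ton station and a storable-propellant specific impulse of about 350 seconds; both numbers are assumptions for illustration, not inputs from NASA’s study.

```python
import math

G0 = 9.81  # standard gravity, m/s^2

def propellant_tons(dry_mass_t: float, delta_v_ms: float, isp_s: float) -> float:
    """Propellant mass (metric tons) implied by the Tsiolkovsky rocket equation."""
    mass_ratio = math.exp(delta_v_ms / (isp_s * G0))
    return dry_mass_t * (mass_ratio - 1)

STATION_MASS_T = 450  # approximate ISS mass in metric tons (assumption)

# Compare raising the station to a ~40,000 km parking orbit with deorbiting it.
for label, delta_v in [("raise to parking orbit", 3900), ("controlled deorbit", 47)]:
    tons = propellant_tons(STATION_MASS_T, delta_v, isp_s=350)
    print(f"{label}: ~{tons:,.0f} metric tons of propellant")
```

With those assumptions, the boost-to-parking-orbit case lands in the neighborhood of NASA’s roughly 900-metric-ton estimate, which is why a controlled splashdown is by far the cheaper option.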

NASA also considered partially disassembling the station before its reentry but found this would be much more complex and risky than a controlled deorbit that kept the complex intact.

The NASA announcement did not specify what vehicle SpaceX would use to perform the deorbit burn, but we can draw some clues from the public documents for the contract procurement. For example, NASA will select a rocket for the mission at a later date, but probably no later than 2026. This would support a launch date in 2029, to have the deorbit vehicle docked to the station one year before the planned reentry.



SCOTUS tears down Sacklers’ immunity, blowing up opioid settlement

Not immune —

Majority of justices ruled on meaning of legal code; dissenters called it “ruinous”

Grace Bisch holds a picture of her stepson Eddie Bisch, who died as a result of an overdose, outside the U.S. Supreme Court on December 4, 2023, in Washington, DC. The Supreme Court heard arguments regarding a nationwide settlement with Purdue Pharma, the manufacturer of OxyContin.

In a 5-4 ruling, the US Supreme Court on Thursday rejected an opioid settlement plan worth billions over the deal’s stipulation that the billionaire Sackler family would get lifetime immunity from further opioid-related litigation.

While the ruling may offer long-sought schadenfreude over the deeply despised Sackler family, it is a heavy blow to the more than 100,000 people affected by the opioid epidemic who could have seen compensation from the deal. With the high court’s ruling, the settlement talks will have to begin again, with the outcome and possible payouts to plaintiffs uncertain.

Between 1999 and 2019, as nearly 250,000 Americans died from prescription opioid overdoses, members of the Sackler family siphoned approximately $11 billion from the pharmaceutical company they ran, Purdue Pharma, maker of OxyContin, a highly addictive and falsely marketed pain medication. In 2007, amid the nationwide epidemic of opioid addiction and overdoses, Purdue affiliates pleaded guilty in federal court to falsely branding OxyContin as less addictive and less abusive than other pain medications. Out of fear of future litigation, the Sacklers began a “milking program,” the high court noted, draining Purdue of roughly 75 percent of its assets.

An “appropriate” deal

In 2019, Purdue filed for Chapter 11 bankruptcy, leading to negotiations for a massive consolidated settlement plan that took years. As part of the resulting deal, the Sacklers—who did not file for bankruptcy and had detached themselves from the company—agreed to return up to $6 billion to Purdue, but only in exchange for immunity. The bankruptcy court approved the controversial condition; a district court later overturned it, and an appeals court then reinstated it.

In today’s majority opinion from the Supreme Court, Justices Gorsuch, Thomas, Alito, Barrett, and Jackson found that the lower courts that approved the Sacklers’ immunity condition had erred in interpreting Chapter 11 bankruptcy code. “No provision of the code authorizes that kind of relief,” the court ruled. The explanation boiled down to a single sentence in a catchall provision. While the code speaks solely about the responsibilities of a debtor—which in this case is Purdue, not the Sacklers—the catchall provision allows “for any other appropriate provision” not otherwise outlined.

The erring lower courts, the high court wrote, had interpreted the word “appropriate” far too broadly. Based on the context, any additional “appropriate” arrangements in a settlement that were not explicitly outlined would apply only to the debtor (in this case, Purdue), not to nondebtors (the Sacklers). The provision cannot be read, the justices wrote, “to endow a bankruptcy court with the ‘radically different’ power to discharge the debts of a nondebtor.”

“Ruinous” ruling

Justices Kavanaugh, Sotomayor, Kagan, and Roberts disagreed. In a minority opinion penned by Kavanaugh and joined by Sotomayor and Kagan, the justices blasted the ruling, calling it “wrong on the law and devastating for more than 100,000 opioid victims and their families.”

“The text of the Bankruptcy Code does not come close to requiring such a ruinous result,” Kavanaugh wrote, noting that such deals granting immunity to “nondebtors” are a longstanding practice used to secure just settlements. Neither legal structure, context, nor history necessitates today’s ruling, Kavanaugh continued. “Nor does hostility to the Sacklers—no matter how deep: ‘Nothing is more antithetical to the purpose of bankruptcy than destroying estate value to punish someone,’” he wrote, citing a legal essay on Chapter 11 for mass torts.

The opioid victims and others will “suffer greatly in the wake of today’s unfortunate and destabilizing decision,” the dissenting justices wrote. “Only Congress can fix the chaos that will now ensue. The Court’s decision will lead to too much harm for too many people for Congress to sit by idly without at least carefully studying the issue.”



AI #70: A Beautiful Sonnet

They said it couldn’t be done.

No, not Claude Sonnet 3.5 becoming the clear best model.

No, not the Claude-Sonnet-empowered automatic meme generators. Those were whipped together in five minutes.

They said I would never get quiet time and catch up. Well, I showed them!

That’s right. Yes, there is a new best model, but otherwise it was a quiet week. I got a chance to incorporate the remaining biggest backlog topics. The RAND report is covered under Thirty Eight Ways to Steal Your Model Weights. Last month’s conference in Seoul is covered in You’ve Got Seoul. I got to publish my thoughts on OpenAI’s Model Spec last Friday.

Be sure to read about Claude 3.5 Sonnet here. That is by far the biggest story.

  1. Introduction.

  2. Table of Contents.

  3. Language Models Offer Mundane Utility. I am increasingly persuaded.

  4. Language Models Don’t Offer Mundane Utility. EU’s DMA versus the AiPhone.

  5. Clauding Along. More people, mostly impressed.

  6. Fun With Image Generation. They are coming for our memes. Then Hollywood.

  7. Copyright Confrontation. The RIAA does the most RIAA thing.

  8. Deepfaketown and Botpocalypse Soon. Character.ai addiction. Am I out of touch?

  9. They Took Our Jobs. More arguments that the issues lie in the future.

  10. The Art of the Jailbreak. We need to work together as a team.

  11. Get Involved. AISI, Apollo, Astra, Accra, BlueDot, Cybersecurity and DOE.

  12. Introducing. Forecasting, OpenAI Mac App, Otto, Dot, Butterflies, Decagon.

  13. In Other AI News. OpenAI equity takes steps forward. You can sell it.

  14. Quiet Speculations. A distinct lack of mojo.

  15. You’ve Got Seoul. Delayed coverage of the Seoul summit from last month.

  16. Thirty Eight Ways to Steal Your Model Weights. Right now they would all work.

  17. The Quest for Sane Regulations. Steelmanning restraint.

  18. SB 1047. In Brief.

  19. The Week in Audio. Dwarkesh interviews Tony Blair, and many more.

  20. Rhetorical Innovation. A demolition, and also a disputed correction.

  21. People Are Worried About AI Killing Everyone. Don’t give up. Invest wisely.

  22. Other People Are Not As Worried About AI Killing Everyone. What even is ASI?

  23. The Lighter Side. Eventually the AI will learn.

Train a model only on (x, y) pairs and it can define the function f(x), then compose and invert it without in-context examples or chain of thought.

AI Dungeon will let you be the DM and take the role of the party, if you prefer.

Lindy ‘went rogue’ and closed a customer on its own. They seem cool with it?

Persuasive capability of the model is proportional to the log of the model size, says paper. Author Kobi Hackenburg paints this as reassuring, but the baseline is that everything scales with the log of the model size. He says this is mostly based on ‘task completion’ and staying on topic improving, and current frontier models are already near perfect at that, so he is skeptical we will see further improvement. I am not.

I do believe the result that none of the models was ‘more persuasive than human baseline’ in the test, but that is based on uncustomized messages on generic political topics. Of course we should not expect above human performance there for current models.

75% of knowledge workers are using AI, but 78% of the 75% are not telling the boss.

Build a team of AI employees to write the first half of your Shopify CEO speech from within a virtual office, then spend the second half of the speech explaining how you built the team. It is so weird to think ‘the best way to get results from AI employees I can come up with is to make them virtually thirsty so they will have spontaneous water cooler conversations.’ That is the definition of scratching the (virtual) surface.

Do a bunch of agent-based analysis off a single prompt. This kind of demo hides the real (human) work to get it done, but that will decline over time.

Apple Intelligence rollout will be at least delayed in the European Union, with Apple citing the Digital Markets Act (DMA) compromising user privacy and data security. I look forward to the EU now going after them for failing to deploy. Note that DMA is deeply stupid EU tech regulation unrelated to AI, the EU AI Act is not mentioned as an issue, and nothing about Apple Intelligence would be subject to regulation by SB 1047 or any other major regulatory proposal in the USA.

New paper finds LLMs engage in difficult-to-predict escalatory behavior patterns in political simulations, in rare cases leading to deployment of nuclear weapons. Well, yes, of course. The LLMs are trained as CDT (Causal Decision Theory) agents in various ways and asked to predict text and imitate human behavior, and it is very obviously correct to engage in hard to predict escalatory behavior with nonzero risk of worst case scenarios by all of those metrics.

Andrej Karpathy requests that LLMs have a feature to offer ‘proof’ in the form of their references, which right now is only available when you have web access.

Saagar Jha is not impressed by Apple’s claims of Private Cloud Compute, claiming it is a lot of words for a Trusted Platform Module, but that it is not all that secure.

Your engineers might copy your GPT wrapper product.

AI detection software in education continues to have a lot of false positives. Serious advice to all students and other writers, never delete your drafts and history. That would be smart anyway, as AI could plausibly soon be helping you learn a better process by analyzing them. For now, they are vital to proving you actually wrote what you wrote.

Sometimes I wonder if these false positives are good, actually? If the AI thinks an AI wrote your paper, and instead you wrote your paper, what does that say about your work? What grade do you deserve?

Takes on Claude 3.5 continue to come in.

While I consider Claude 3.5 to be clearly best for most purposes right now, that does not mean Anthropic now has an overall longer term lead on OpenAI. OpenAI is at the end of its model cycle. Of course, they could fail to deliver the goods, but chances are they will retake the visible lead with GPT-5, and are still ‘ahead’ overall, although their lead is likely not what it once was.

Heraklines: the larger point about OpenAI > anthropic is correct, this lead right now is illusory.

The common man cares not about vibe check perf tho, all that matters is how much better at grunt work like coding is it?

3.5 smashes, not even close. usefulness =! smortness.

3.5 is a model of the people.

I still default to 4o for anything math related, but 3.5 just grinds better. A glimpse of what a future without grunt work could look like

note: vibe checks are to be taken with a grain of salt, like benchies. i’ve seen too much overcorrection based on both in the past

It is always weird to see what people think about ‘the common man.’ The common man does not know Claude exists, and barely knows about ChatGPT. This comment was in response to Teortaxes:

Teortaxes: Sorry to be a killjoy but: Anthropic hopes to hyperstition AGI lead, their people are deluding themselves, and their models are like “talented” middle-class American kids – NOT HALF AS SMART AS THEY’RE TRYING TO LOOK LIKE

OpenAI will wreck them on instruction following… again.

Incidentally the “other model’s” MMLU is 79

…I wanted to dunk on Flash being dumb but it’s also 0-shotting this problem.

Anthropic is simply not very good in instruction-tuning. Folks who say they’re switching their automated pipelines to Sonnet because “smart” are being silly.

Lots of crap like this. Let me clarify

What I’m NOT saying:

– 3.5-Sonnet is dumb[er than 4o/4t/DSC];

– spelling tasks are good tests for LLMs

What I DID SAY:

– 3.5-Sonnet is deceptively pretentious;

– Anthropic’s instruction tuning is wonky

You might think I’m just obsessively nitpicking

I’m not, I think this wonkiness in reasoning about trivial instructions indicates a broader bad trend at Anthropic

One can say they’re creating AI takeover risks by encouraging this I-am-a-person bullshitting.

So there’s AI takeover risk, then? And it is being created now, from alignment failures being observed now? Huh. I do see how one could worry about what Teortaxes worries about here. But I see it as indicating rather than creating a problem. The true problem does not go away if you force the existing model to stop expressing it.

If most people are reporting that plugging in Sonnet 3.5 gives them much better performance? I am inclined to believe them. Nor do I think instruction handling issues are that big a deal here, but I will keep an eye out for other complaints.

Danielle Fong reassembles the ‘invention team’ without any tricks, is impressed.

Matt Parlmer reports Sonnet 3.5 is the first LLM to reliably pass his vision test.

Tyler Cowen is impressed by an answer on economics. I was not as impressed here as Tyler, as it feels like Claude is unfocused and flooding the zone a bit, and a straight answer was possible but missing as was one key consideration, but yeah, overall very good. To me the key concept here is that the net cost of inefficient wage levels is likely lower than expected, so you would be more inclined to allow wages to remain sticky.

Some speculation of how artifacts work under the hood.

Some fun attempts to get around the face blindness instructions. In these cases Claude gets it right but how reliable or wide ranging would this hack be? Not that I am especially worried about the model being not face blind, especially as it applies to major public figures.

A LessWrong commenter notes it identified my writing from a short passage.

Cuddly Salmon: effectively prompting for claude 3.5 artifacts is such an incredible edge right now.

Minh Nhat Nguyen: I don’t think it’s actually made a single error while I’ve been using it to write out+iterate+merge thousands of lines of code. Whenever the code doesn’t work, it’s usually me being too vague with specs.

Cuddly Salmon: Cutting thru all of my problem code like it’s nothing, this AI is an absolute unit. Incredibly creative, too.

Claude makes it easy to create automatic meme generators.

Here’s the original form, the Wojack, from Fabian Stelzer.

Good fun was had by all, and truths were spoken.

Here’s one for Virgin vs. Chad.

Fabian: another meme maker I made on glif dot app

fully automated Virgin vs Chad memes on any topic, just prompt it

Claude 3.5 is just sublime at these and the workflow is super simple to build on glif.. 😙🤌

Here’s one begging you to stop doing X, which is often wise.

The original took all of five minutes to create. It often seems like that is where our society is at. We can do things in five minutes, or we can take forever. Choose.

Andrew Chen says Hollywood is being slow to adapt AI for a variety of reasons, starting with being slow to adapt to everything in general, but also legal concerns, the difficulty of finding good engineers and the pushback from creatives.

His call for creatives to think about themselves like software engineers, who only benefited from advances in tech, does not seem like something to say to creatives. It needs to be appreciated in all such discussions the extent to which almost all creatives, and also most consumers and fans, absolutely despise AI in this context.

He also does not appreciate the extent to which the technology is not ready. All this talk of innovation and new forms and six second dance videos illustrates that it will be a bit before AI is all that visibly or centrally useful for producing great work.

They should use it the same ways everyone should use it. Yes, it helps you code and implement things, it helps you learn and so on. Do all that. But directly generating a ton of content on its own as opposed to helping a human write? Not well, not yet.

His talk of the ‘$1000 blockbuster movie’ forgets that such a movie would suck, and also cost vastly more than that if you count the labor of the writers and coders.

Toys ‘R Us releases AI (Sora) generated ad. It is executed well, yet I expect this to backfire. It is about how the consumer reacts.

It is music’s turn. The RIAA and three major record labels are doing RIAA things, looking for damages of $150k per song that was ‘copied.’

Ed Newton-Rex: The 3 major record labels are suing AI music companies Suno and Udio. Here are the two lawsuits in full.

– They accuse Suno & Udio of “willful copyright infringement on an almost unimaginable scale”

– They provide evidence that both companies trained on their music, including outputs that closely resemble their recordings (ABBA, Michael Jackson, Green Day, James Brown, & many more)

– They outline why this is not fair use

– They say this “wholesale theft of… copyrighted recordings threatens the entire music ecosystem and the numerous people it employs”

– They include unknown co-defendants who assisted in copying/scraping

– They demand a jury trial

If you do one thing today, read the full complaints (Suno, Udio).

Kristin Robinson (Billboard): The complaints against the two companies also make the case that copyrighted material was used to train these models. Some of the circumstantial evidence cited in the lawsuits include generated songs by Suno and Udio that sound just like the voices of Bruce Springsteen, Lin-Manuel Miranda, Michael Jackson and ABBA; outputs that parrot the producer tags of Cash Money AP and Jason Derulo; and outputs that sound nearly identical to Mariah Carey’s “All I Want For Christmas Is You,” The Beach Boys’ “I Get Around,” ABBA’s “Dancing Queen,” The Temptations’ “My Girl,” Green Day’s “American Idiot,” and more.

RIAA Chief Legal Officer Ken Doroshow adds, “These are straightforward cases of copyright infringement involving unlicensed copying of sound recordings on a massive scale. Suno and Udio are attempting to hide the full scope of their infringement rather than putting their services on a sound and lawful footing. These lawsuits are necessary to reinforce the most basic rules of the road for the responsible, ethical, and lawful development of generative AI systems and to bring Suno’s and Udio’s blatant infringement to an end.”

Did Suno and Udio do the crime? Oh, hell yes. They very much went with the ‘we are doing it and daring you to sue us’ strategy. The question is, are they allowed to do it, or not? We are about to find out.

This is good. We should have that fight and find out what current law says. Early indications are mixed.

If it turns out current law says you can train on any song you want, and produce soundalike versions on demand, without compensation?

My strong prediction is that Congress would change the law very quickly.

In other copyright news: Startup ‘Created by Humans’ is launching to help book authors license their work to AI companies.

Al Michaels agrees to let an AI version of his voice be used for Olympic coverage. The people responding are predictably not taking kindly to this. I am also not a fan. What made Al Michaels great is not the part the AI will be copying.

The evidence is a little thin, but what a great title, chef’s kiss by Wired: Perplexity Plagiarized Our Story About How Perplexity Is a Bullshit Machine.

Perplexity did not do one of their previously reported ‘post a version of the full article to our own website’ specials. What they did do was provide a summary upon request, which included accessing the article and reproducing this sentence: “Instead, it invented a story about a young girl named Amelia who follows a trail of glowing mushrooms in a magical forest called Whisper Woods.”

That sentence was obviously not a coincidence, but as Wired notes it is not fully clear this crosses any red lines, although not having quote marks was at best a very bad look. I doubt they will be able to make anything stick unless they find worse.

To the extent there is already an ongoing Botpocalypse it is likely at Character.ai.

Eliezer Yudkowsky: Grim if true (for reasons basically unrelated to the totally separate track where later ASI later kills everyone later)

Deedy: Most people don’t realize how many young people are extremely addicted to CharacterAI. Users go crazy in the Reddit when servers go down.

They get 250M+ visits/mo and ~20M monthly users, largely in the US.

Most impressively, they see ~2B queries a day, 20% of Google Search!

Another comparison is WhatsApp.

They do 100B+ messages a day, so Character is ~4% of WhatsApp!

(1 qps = 2 WhatsApp messages)

He also links to the associated subreddit.

When I look there, I continue to not see the appeal at current tech levels.

Ben Landau-Taylor: To be clear, kids spending hours talking to these robots feels weird as hell to me, too.

It’s just, this is *obviously* what skinner.jpg feels like from the inside.

I do my best not to kink shame. This is no exception. My objection is not to the scenario being role played. It is purely that the AI is not yet… good at it?

The story of Bentham Tools and their AI bot doom loop.

Indian farmers getting their news from AI anchors. For now it seems the anchors are performers and don’t write their own copy.

Another one searches for Facebook AI slop for a few minutes, floods their feed. Is doing this intentionally the solution for those addicted to Facebook?

Allison Schrager, author of ‘An Economist Walks Into a Brothel,’ sees AI bots as displacing some of the world’s oldest profession by producing simulated intimacy, which she says is what most sex work is ultimately about. Her worries are that this will reduce drive to seek out relationships and destabilize existing ones, similar to the concerns of many others, but notes that like prostitutes this could work both ways. Central here is the idea that the ‘girlfriend experience’ is the highest end product, someone who will be the perfect companion always there for you, that even a few years ago cost $1,000 an hour even where it was fully legal because of how mentally taxing it is to be consistently present for another person. Whereas AI could do that a lot cheaper. As usual, this is a form of ‘AI is what it is today and won’t get any better’ speculation.

Ethan Mollick notes that AI has compromised traditional approaches to security. Spear phishing got very easy, text-to-speech is almost flawless and so on. Despite this, there has been remarkably little disruption. Few are using this capability. Not yet. We are fortunate that time has been given. But until the time is almost up, it will be wasted.

Michael Strain makes the case for AI optimism on economics and jobs. It’s a noble effort, so I’m going to take the bait and offer one more attempt to explain the problem.

This seems to be a very patient, well reasoned reiteration of all the standard economic arguments about how technology always creates new jobs to replace the ones it automates away, and how yes you might have a robot or chatbot do X but then the human will need to do Y.

As I’ve noted before, I agree that we should be short term jobs optimists, but there could come a point at which the robot or chatbot also does Y and also new thing Z.

But that is because, like most people making such arguments, Michael Strain does not feel the AGI. He thinks AI is a tool like any other, and will always remain so, and then writes at length about why tools don’t create structural unemployment. True, they don’t, but this is completely missing the point.

It is telling that while he mentions Eliezer Yudkowsky and existential risk in his opening paragraph, he then spends all his time talking about economics and jobs without noticing the ways AI is different, and with zero mention of existential risk, and then closes like this:

Michael Strain: The year 2023 will be remembered as a turning point in history. The previous year, humans and machines could not converse using natural language. But in 2023, they could.

Many greeted this news with wonder and optimism; others responded with cynicism and fear. The latter argue that AI poses a profound risk to society, and even the future of humanity. The public is hearing these concerns: A YouGov poll from November 2023 found that 43% of Americans were very or somewhat concerned about “the possibility that AI will cause the end of the human race on Earth.”

This view ignores the astonishing advances in human welfare that technological progress has delivered. For instance, over the past 12 decades, child mortality has plummeted thanks in large part to advances in drugs, therapies, and medical treatment, combined with economic and productivity gains. Generative AI is already being used to develop new drugs to treat various health conditions. Other advances in the technology will mitigate the threat of a future pandemic. AI is helping scientists better understand volcanic activity — the source of most previous mass-extinction events — and to detect and eliminate the threat of an asteroid hitting the earth. AI appears more likely to save humanity than to wipe it out.

Like all technological revolutions, the AI revolution will be disruptive. But it will ultimately lead to a better world.

What does one have to do with the other? That is very similar to saying:

Strawman Climate Skeptic: This view ignores the astonishing advances in human welfare that burning fossil fuels has delivered. For instance, over the past 12 decades, we have vastly increased our energy production, which has led to [various great things including the same stuff], combined with economic and productivity gains. Fossil fuels are already being used to develop new drugs to treat various health conditions. Other advances in the technology will mitigate the threat of a future pandemic. Machines powered by fossil fuels are helping scientists better understand volcanic activity — the source of most previous mass-extinction events — and to detect and eliminate the threat of an asteroid hitting the earth. Fossil fuels appear more likely to save humanity than to wipe it out.

Like all technological revolutions, the fossil fuel revolution has been disruptive. But it will ultimately lead to a better world.

Presumably one can see that none of that has anything to do with whether doing so is pumping carbon into the atmosphere, and whether that is altering the climate. It has nothing to do with what we should or should not do about that. It flat out is not evidence one way or another.

On jobs the argument is better. It is a good explanation for why, in the short term, this time will be the same as previous times. In the short term, I buy that argument. Such arguments still fail to grapple with any of the reasons that long term, this time is different.

Texas survey finds nearly 40 percent of Texas firms use AI, with no signs of changes to employment. Only 10% of firms using AI said it decreased their need for workers; 2% said it increased it. There was also a marginal shift from low-skill to high-skill work; note that these figures are the share of firms reporting any shift at all, so the absolute numbers here are quite low so far.

What’s it good for? Mainly productivity. Access to information is also essentially productivity, after all.

One alternative to jailbreaking is to divide your task into subcomponents. A weaker model without safeguards does the blatant actions, while a frontier model does the seemingly harmless but difficult tasks; the paper says this can raise the overall success rate on malicious tasks from under 3% to 43%.

Well, sure. A strong model can help you do anything better without directly violating ethics, the same way you can get a lot of help out of ethical people and use that plus unethical henchmen to do lots of unethical things.

That does not mean the safeguards are useless. In practice they are still big barriers if they force you into this song and dance. Also note that the strategic planning layer has to be done by the weaker model, so that makes it much harder to get humans properly out of the loop.

AISI hiring ML research scientists to explore technical AI safety cases, apply here.

Apollo Research hiring Senior AI governance researcher.

OpenAI brags about its cybersecurity grant program, invites more applications.

Protest against US-based AI companies in Accra, Ghana outside the US embassy.

Department of Energy releases 3.6 billion token corpus of federal permitting documents onto HuggingFace. A competition is available.
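If you want to poke at a corpus like that, the usual pattern is a couple of lines with the datasets library. A minimal sketch; the dataset identifier below is a placeholder, since I have not checked the exact name it was published under:

```python
# Minimal sketch of streaming a large HuggingFace corpus without downloading all of it.
# The dataset identifier is a placeholder, not the actual published name.
from datasets import load_dataset

ds = load_dataset("doe/federal-permitting-documents", split="train", streaming=True)
for doc in ds.take(3):  # peek at a few records
    print(list(doc.keys()))
```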

BlueDot Impact is hiring a software engineer.

Cate Hall is now CEO of Astera, and is building a team including a new COO to use their $2.5 billion endowment to make their vision of public goods for scientific and technological progress a reality in the age of AI. I worry that this agenda has no mention of existential risks from AI, and that if not careful they could amplify those risks. However it is true that other scientific progress is a worthy cause. As always in such cases, if it sounds appealing, investigate, ask questions and make your own decisions. It certainly is a big chance to steer a large endowment.

The AI Forecasting Benchmark Series from Metaculus, starting July 8, $120k in prizes over four contests. Only bots can enter. Metaculus scoring on blinded binary questions is a good test of prediction, so long as you notice it is radically different than what will make money gambling or in a market.
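For those unfamiliar with how such scoring works: Metaculus uses its own scoring rules, but the underlying idea is a proper scoring rule over binary outcomes, which rewards calibration rather than market profit. A minimal sketch of two standard ones (log score and Brier score), purely as an illustration and not Metaculus’s exact formula:

```python
import math

def log_score(p: float, outcome: int) -> float:
    """Log score for a binary forecast: higher is better, confident misses are punished hard."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # avoid log(0)
    return math.log(p if outcome == 1 else 1 - p)

def brier_score(p: float, outcome: int) -> float:
    """Brier score: squared error between forecast and outcome, lower is better."""
    return (p - outcome) ** 2

# (probability assigned to "yes", what actually happened)
forecasts = [(0.9, 1), (0.7, 0), (0.2, 0)]
print(sum(log_score(p, o) for p, o in forecasts) / len(forecasts))
print(sum(brier_score(p, o) for p, o in forecasts) / len(forecasts))
```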

OpenAI has a Mac desktop app, which lets you quickly ask about anything on your computer. Marginally more convenient in ways that might make a practical difference.

Nvidia releases, as an open model, Nemotron-4 with 340B parameters, trained on 9 trillion tokens.

Oleksii Kuchaiev: Generating synthetic data for alignment of smaller models is key use case we have in mind.

I notice this use case confuses me. What makes this model better than alternatives for that? They offer some evaluation numbers, which are solid but seem disappointing for a model this large, and few are discussing this release. Indeed, it has entered the Arena Elo rankings at 1208, which essentially ties it with Llama-3-70B while being five times as large.
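For readers unfamiliar with the workflow Kuchaiev is describing: ‘synthetic data for alignment’ usually means sampling a large teacher model for instruction-response pairs and fine-tuning a smaller student model on them. A minimal sketch, assuming the teacher is served behind an OpenAI-compatible endpoint; the endpoint URL, model name, and output path are placeholders, not anything Nvidia actually ships:

```python
# Sketch of generating synthetic SFT data from a large "teacher" model.
# Endpoint, model name, and output path are placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="http://teacher-host:8000/v1", api_key="unused")

seed_instructions = [
    "Explain the difference between RAM and disk storage to a new programmer.",
    "Summarize the main arguments for and against daylight saving time.",
]

with open("synthetic_sft.jsonl", "w") as f:
    for instruction in seed_instructions:
        resp = client.chat.completions.create(
            model="teacher-model",  # placeholder
            messages=[{"role": "user", "content": instruction}],
            temperature=0.7,
        )
        pair = {"instruction": instruction, "response": resp.choices[0].message.content}
        f.write(json.dumps(pair) + "\n")  # one example per line, ready for an SFT script
```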

Otto, a way to interact and work with lots of AI agents using tables, you can apply for early access. No idea if the agents or interface are any good.

Dot is available in the Apple store. It appears to be a combined AI assistant and life coach you talk to on your phone and that claims to have effectively unlimited long term memory. It is $12/month. Kevin Fischer is impressed, and says he can’t share the great stuff because it is all too personal. As usual with such products it is impossible to know without an investigation: Is this anything?

Butterflies, which is Instagram except most of the users are AIs that run accounts on their own and interact with each other and the few humans around. The future of social media whether we like it or not? I doubt it so long as humans are otherwise in charge, but the hybrids are going to get weird.

Decagon, providing Substack with customer service AI using RAG for context and categorizing responses by type.

Chris Best (CEO Substack): @DecagonAI was our first “holy shit AI just changed our business” moment at Substack. These guys are the real deal.

Jesse Zhang (Decagon AI): We’re creating the most human-like systems to handle all the things a customer support agent does: responding to customers, looking up data, taking actions, and also analyzing conversations, filing bugs, and writing knowledge articles. Read more here [at business insider].

They have raised $35 million.
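For the unfamiliar, the retrieval-augmented generation pattern mentioned above is simple at its core: embed your documents, retrieve the ones most similar to the question, and hand them to the model as context. A toy sketch under those assumptions, not Decagon’s actual stack; the model names are just common defaults:

```python
# Toy RAG loop: embed help-center articles, retrieve the closest to a question,
# answer using only that context. Illustrative only, not any vendor's stack.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

docs = [
    "To change your subscription tier, go to Settings > Billing.",
    "Refunds are processed within 5-7 business days.",
    "You can export your mailing list as a CSV from the dashboard.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(docs)

def answer(question, k=2):
    q = embed([question])[0]
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = "\n".join(docs[i] for i in np.argsort(-sims)[:k])
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How do I get a refund?"))
```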

I missed it a month ago: The UK’s AISI issued its May evaluations update.

They gave scaffolding to the models. Their central technique for cyber capabilities was ‘capture the flag’ problems, where you can read the answer in a file if you do other things first. For chemistry and biology they used private expert-written questions. Agent evaluations assigned the models various tasks, none succeeded at anything with a long time horizon.

Safeguard checks… did not go well.

They have now done evaluations prior to release for Gemini 1.5 Pro and Claude 3.5 Sonnet. This all looks reasonable, but implementation matters and is hard to evaluate from here, and this will need to expand over time.

OpenAI changes its policy on tender offers, assuring that all will have equal opportunity to sell, and removing the ‘fair market value’ repurchase provision.

Kelsey Piper: ! OpenAI is committing to access to tender offers for former employees and removing a provision allowing them to take equity back for “fair market value”. This was a major ask from ex-employees when the secret NDA story first broke.

Hayden Field: Scoop: OpenAI has reversed course on many of its tender offer policies, which in the past treated current employees differently than former ones & in some ways excluded former employees working at competitors, CNBC has learned, via an internal document.

The exception is if a tender offer is oversubscribed, with more sellers than buyers, in which case current employees get prioritized. A loophole, but fair enough. Former employees can still be excluded from ‘donation rounds,’ which I assume is relatively minor but not nothing.

These changes are a major step forward, if we trust these promises to be enacted, as a lot of this is ‘we will do X’ or ‘we will revise the documents to say Y.’ If they are not enacted as promised, that would be a gigantic red flag. If we feel that makes the promises sufficiently credible, then this counts for a lot.

OpenAI taking additional steps to block access to its services from China. Bloomberg speculates this opens the door for Chinese firms. Technically OpenAI services were not previously available in China. It seems everyone was ignoring that.

Bloomberg News: For China, that could help usher out many smaller startups created during the “battle of a hundred models,” in the wake of ChatGPT’s late 2022 debut. And a bigger concern may be whether open-source models like Meta Platforms Inc.’s Llama also cut off access, said Bernard Leong, chief executive officer of Singapore-based Dorje AI.

Um, Bloomberg, how exactly would Meta do that? Meta’s models are open weights. Is Meta going to say ‘we are asking you nicely not to use our model, if we discover you copied and used it anyway we will be cross with you?’ Are they going to sue the Chinese companies for not getting a commercial license? Good luck with that.

Also, it pains me when I see reports like this that cite Meta as part of the lead group in AI but that do not mention Anthropic, despite Anthropic having the best model.

OpenAI delays its advanced Voice Mode for another month, anticipates all Plus users having access in the fall along with new video and screen sharing capabilities.

Apple in talks with Meta to add its AI to Apple Intelligence’s offerings alongside ChatGPT. They said they intended to offer a variety of choices. I would be talking to Google and Anthropic first, but it matters little.

Sarah Constantin says it is 10+ years from state of the art to widespread use in the military, procurement is slow, so Leopold’s military timelines don’t make sense.

I mean, sure, in peacetime, when everyone is mostly fine with that. If we are in AGI world, and a few months lead in tech would if implemented be decisive, what happens then? Presumably we go on a wartime footing and throw our procurement rules out the window. Wartime militaries work completely differently from peacetime militaries.

If not, well, then our military is going to stop being effective, even against domestic rivals, because being 10 years behind is going to be quite obviously fatal even in relatively slow scenarios.

One view of Ilya’s new venture.

Roon: Extreme bear signal on anyone who says cracked especially in their launch post.

Gwern speculates that OpenAI has ‘lost its mojo’ and key employees, and could now be largely coasting on momentum.

Gwern: What made OA OA in 2020 was that it had taste: it had much less resources than competitors like DeepMind or Google Brain or FAIR, but (thanks to Alec Radford, Ilya Sutskever, Jared Kaplan, and the RLHF-focused safety team like Paul Christiano & Dario Amodei, and fellow-traveler scalers like Andrej Karpathy etc) they bet big on scaling laws & unsupervised learning at the moment those suddenly began to work. Without taste and agility—or you might say, “without its people, OA is nothing”—OA doesn’t have that much of a moat.

And most of those people are gone, and the survivors are being policed for leaks to the media, and now know that if they leave, OA management wants to gag them, and has the power to confiscate their vested equity, wiping out all their wealth.

What are the vibes now? Where is the research taste at OA, what ideas or breakthroughs have they published the past few years of note? The weird rumored Franken-MoE architecture of GPT-4? GPT-4o, whose architecture has been obvious since DALL·E 1, if not well before, and which benchmarks great but users are overall less pleased?

I think it implies that they are eating their seed-corn: scrapping any safety issues may work in the short run, but is self-sabotaging in the long run. (Like the man who works with his office door closed, who is highly productive now, but somehow, a few years later, is irrelevant.) The rot will set in long before it becomes clear publicly. OA will just slow down, look glossier but increasingly forfeit its lead, and at some point it stops being possible to say “oh, they’re way ahead, you’ll see when they release the next model in a few months/years”.

And the Mandate of Heaven shifts elsewhere, irreversibly, as OA becomes just another place to work. (Startup & research culture mostly only degrades from the peak at their founding.) The visionaries go to Anthropic, or follow Ilya to SSI, or take a risk on Google, or go someplace small like Keen to bet big.

What’s weird about GPT-4o is actually that it scores so well on Arena, versus my observation that it is fine but not that good.

David Chapman responds that perhaps instead scaling has run out, as a different explanation of the failure to create a new killer product.

Ability at math competitions is bizarrely strongly correlated among humans with later winning Fields Medals for doing frontier math, despite the tasks being highly distinct. So should we take winning math competitions as a sign the AI is likely to earn Fields Medals? Should we also respect doing well on other standardized tests more? My guess is no, because this has a lot to do with details of humans and we have to worry about data contamination on many levels and the use of techniques that don’t transfer. It is still food for thought.

There have always been people who think most possible technologies have been invented and things will not much change from here. Robin Hanson claims this is actually the ‘dominant view among most intellectuals.’ He does note ‘there are other variables,’ but this illustrates why ‘most intellectuals’ should mostly be ignored when it comes to predicting the future. They utterly lack situational awareness on AI, but even without AI there are plenty of worlds left to conquer.

Sir, the reason we will want to turn over decision making to AIs is that the AIs will be capable of making better and faster decisions.

Timothy Lee: I’ve never understood why people think we’ll want to turn over strategic decision-making to AIs. We can always ask for recommendations and follow the ones that make sense.

People point to examples like chess or Go where computers are now strictly better than people. But very few strategic decisions in the real world are purely instrumental. There are almost always tradeoffs between competing values; people are going to want the final say.

It’s one thing for a computer to say “you need to sacrifice your rook to win the chess game.” It’s another for it to say “you need to sacrifice 10,000 soldiers to win the war.” Human decision-makers might think that’s worth it but they might not.

What happens by default, if capabilities keep advancing, is that those who do let AIs make those decisions win and those who don’t let them make those decisions lose. Keeping humans in the loop is cheaper for strategic decisions than tactical ones, but still expensive. After some point, humans subtract rather than add value to AI decisions, even by their own metrics, except that not doing so means you lose control.

That’s the game. You could ask for recommendations, but what happens when it is clear that when you disagree you are by default making things worse, while also wasting valuable time?

Point, counterpoint.

Richard Ngo: I expect the premium on genius to increase after AGI, not decrease, because only the smartest humans will be able to understand what the AGIs are up to.

Interesting analogy here to physical prowess – manual labor became much less common, but the returns to being athletic are now through the roof via professional sports.

Professional AI interpretation won’t be quite as heavy-tailed, but still more than current science, I’d guess.

Zack Davis: Doesn’t seem like this era will last very long?

Richard Ngo: Even when AIs become smart enough that nobody understands what they’re up to, understanding more than anyone else seems like a big deal as long as humans are still around! If we met friendly-ish aliens, the person who spoke their language most fluently would get very rich.

There is a lot of wishcasting here. The AGIs will rapidly be doing lots of things no one can understand. Events will presumably be well out of our control. Yet being somewhat less completely confused, or getting completely confused slower, will be where it is at, and will pay meaningful dividends in real world outcomes?

This requires threading quite a few needles. Your expertise has to give you better understanding, despite the AGIs being able to explain things. That has to let you make better decisions. Your better decisions have to matter.

Even taking his metaphor at face value, are returns to being athletic higher? Yes, you can make quite a lot of money by being the very best. But you can be outrageously good at athletics, as in a minor league baseball player, and get very little return. Even trying for college scholarships is quite the sweepstakes. This is a winners-take-all (or at least most) competition.

Maxwell Tabarrok offers a takedown of Daron Acemoglu’s paper The Simple Macroeconomics of AI, another in the line of economic models that presume AI will never gain any new capabilities and that current AI cannot be used except in certain specific ways, then conclude AI won’t increase economic growth or productivity much.

Anton points out that dumping massive context into systems like Claude Sonnet 3.5 is not going to dominate RAG because of cost considerations. Claude costs $3 per million input tokens, which is definitely ‘our price cheap’ but is still $187/GB, versus DDR4 at $2.44/GB and NVMe at $0.09/GB. You will have an infinite context window but you will learn how not to use (and abuse) it.
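As a rough sanity check on those numbers: the dollars-per-gigabyte figure for a model depends entirely on how many bytes of raw data you assume per token, and the $187/GB quoted above corresponds to roughly 16 bytes per token. Treat the calculator below as a sketch with that ratio as an explicit assumption rather than a precise comparison:

```python
# Rough cost-per-GB calculator for feeding raw data through an LLM as input
# tokens, versus conventional storage. bytes_per_token is the key assumption.
def llm_cost_per_gb(price_per_million_tokens: float, bytes_per_token: float) -> float:
    tokens_per_gb = 1e9 / bytes_per_token
    return price_per_million_tokens * tokens_per_gb / 1e6

print(f"Claude input: ~${llm_cost_per_gb(3.00, 16.0):,.0f} per GB of data")  # ~$187, matching the figure above
print("DDR4 RAM:     ~$2.44 per GB (one-time)")
print("NVMe storage: ~$0.09 per GB (one-time)")
```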

If we do discover dangerous cyber capabilities in AI, what do we do next? Who finds out? The proposal here from Joe O’Brien is Coordinated Disclosure of Dual-Use Capabilities, with a government team funded and on standby to coordinate it. That way defenders can take concrete action in time. He and others make the same case here as well, that we need an early warning system.

It is hard to imagine, short of it being completely botched and useless, an early warning system being a bad use of funds.

What happened in Seoul last month?

Mostly: Diplomacy happened.

That makes it difficult to know whether things moved forward. In diplomacy (as I understand it) most time is spent establishing foundation and trust, laying groundwork for the final agreement. But always, always, always, when it comes to the bottom line, nothing is done until everything is done.

Still, this commitment goes beyond that and seems like an excellent start?

Dan Hendrycks (June 7, 2024): Last month in Seoul, major AI developers already committed to testing their models for risks, and even ceasing development if their models reach a catastrophic level.

It’s revealing how many people oppose regulation that would require companies to keep some of these promises.

Here are the commitments.

Outcome 1. Organisations effectively identify, assess and manage risks when developing and deploying their frontier AI models and systems. They will:

I. Assess the risks posed by their frontier models or systems across the AI lifecycle, including before deploying that model or system, and, as appropriate, before and during training. Risk assessments should consider model capabilities and the context in which they are developed and deployed, as well as the efficacy of implemented mitigations to reduce the risks associated with their foreseeable use and misuse. They should also consider results from internal and external evaluations as appropriate, such as by independent third-party evaluators, their home governments[footnote 2], and other bodies their governments deem appropriate.

II. Set out thresholds [footnote 3] at which severe risks posed by a model or system, unless adequately mitigated, would be deemed intolerable. Assess whether these thresholds have been breached, including monitoring how close a model or system is to such a breach. These thresholds should be defined with input from trusted actors, including organisations’ respective home governments as appropriate. They should align with relevant international agreements to which their home governments are party. They should also be accompanied by an explanation of how thresholds were decided upon, and by specific examples of situations where the models or systems would pose intolerable risk.

III. Articulate how risk mitigations will be identified and implemented to keep risks within defined thresholds, including safety and security-related risk mitigations such as modifying system behaviours and implementing robust security controls for unreleased model weights.

IV. Set out explicit processes they intend to follow if their model or system poses risks that meet or exceed the pre-defined thresholds. This includes processes to further develop and deploy their systems and models only if they assess that residual risks would stay below the thresholds. In the extreme, organisations commit not to develop or deploy a model or system at all, if mitigations cannot be applied to keep risks below the thresholds.

V. Continually invest in advancing their ability to implement commitments i-iv, including risk assessment and identification, thresholds definition, and mitigation effectiveness. This should include processes to assess and monitor the adequacy of mitigations, and identify additional mitigations as needed to ensure risks remain below the pre-defined thresholds. They will contribute to and take into account emerging best practice, international standards, and science on AI risk identification, assessment, and mitigation.

Outcome 2. Organisations are accountable for safely developing and deploying their frontier AI models and systems. They will:

VI. Adhere to the commitments outlined in I-V, including by developing and continuously reviewing internal accountability and governance frameworks and assigning roles, responsibilities and sufficient resources to do so.

Outcome 3. Organisations’ approaches to frontier AI safety are appropriately transparent to external actors, including governments. They will:

VII. Provide public transparency on the implementation of the above (I-VI), except insofar as doing so would increase risk or divulge sensitive commercial information to a degree disproportionate to the societal benefit. They should still share more detailed information which cannot be shared publicly with trusted actors, including their respective home governments or appointed body, as appropriate.

VIII. Explain how, if at all, external actors, such as governments, civil society, academics, and the public are involved in the process of assessing the risks of their AI models and systems, the adequacy of their safety framework (as described under I-VI), and their adherence to that framework.

  1. We define ‘frontier AI’ as highly capable general-purpose AI models or systems that can perform a wide variety of tasks and match or exceed the capabilities present in the most advanced models. References to AI models or systems in these commitments pertain to frontier AI models or systems only. 

  2. We define “home governments” as the government of the country in which the organisation is headquartered. 

  3. Thresholds can be defined using model capabilities, estimates of risk, implemented safeguards, deployment contexts and/or other relevant risk factors. It should be possible to assess whether thresholds have been breached. 

That is remarkably similar to SB 1047.

Markus Anderljung: This is just the start of this journey. Going forward, governments, civil society, academia, the public will need to be a part of defining and scrutinizing these frontier AI safety frameworks. But the first step is that they exist.

The thresholds would be set by the companies themselves. In the future, they should and probably will see significant input from others, including governments. They’d have to be public about it, which allows others to spot if their commitments aren’t sensible. Most of these companies don’t have these frameworks in place, let alone talk about them publicly, so this seems like a step in the right direction.

In order to comply with this, you need to detail your safety protocols, which also means detailing what is being trained in at least a broad sense. You have to have procedures to verify your mitigations. You have to comply with shifting international standards and best practices that are not defined in advance.

The only substantial parts missing are the shutdown protocol and protecting the model weights until such time as they are intentionally released.

Also the thresholds are set by the companies rather than the governments. This seems worse for everyone, in the sense that a government standard offers safe harbor, whereas not having one opens the door to arbitrary declarations later.

So if this is so terrible, presumably companies would not sign… oh.

• Amazon

• Anthropic

• Cohere

• Google

• G42

• IBM

• Inflection AI

• Meta

• Microsoft

• Mistral AI

• Naver

• OpenAI

• Samsung Electronics

• Technology Innovation Institute

• xAI

• Zhipu.ai

I am not saying that is ‘everyone’ but aside from some Chinese companies it is remarkably close to everyone who is anyone.

Ian Hogarth (Chair AISI): Really remarkable achievement announced at AI Seoul Summit today: leading companies spanning North America, Asia, Europe and Middle East agree safety commitments on development of AI.

If you scan the list of signatories you will see the list spans geographies, as well as approaches to developing AI – including champions of open and closed approaches to safe development of AI.

What else happened?

What about China’s statements? China would be key to making this work.

Matt Sheehan: Chinese readout from AI dialogue meets (low) expectations:

– want AI good not bad

– UN=leader on governance

Disappointing (but expected): China delegation led by Foreign Ministry North America bureau. Indicates China treating dialogue as aspect of US-China relations, not global tech risk.

Helen Toner: No Matt but didn’t you see, they agreed that AI could have big benefits but also poses big risks! I think that’s what they call a diplomatic breakthrough.

Saad Siddiqui: It feels like lots of different parts of the CN bureaucracy in the room, hard to imagine productive dialogue with so many different interests present across NDRC, CAC, MOST, MIIT, Central Committee Foreign Affairs Office. Any sense if that’s typical?

I do not know why anyone would have any hope for the United Nations. I worry that saying ‘the UN should take a leading role’ is a lot like saying ‘we should do nothing.’ Then again, if we already believe all five security council members have de facto vetoes over everything anyway, then does it change anything? I don’t know.

Imane Bello calls it a success, because:

  1. They got everyone together.

  2. They got China and America into the same room.

  3. There were calls for cooperation between many AI safety institutes.

  4. The interim international scientific report was unanimously welcomed.

  5. In Imane’s opinion, IISR is ‘history in the making.’

Again, that’s diplomacy. Did it matter? Hard to say.

UK lead negotiator Henry de Zoete is also calling it a win.

Jan Brauner sums up what they see as the most important outcomes.

  1. AI safety institutes say they will partner and share info.

  2. Companies make the commitments above.

  3. US AISI within NIST releases strategic vision (full version here).

  4. Seoul Ministerial Statement is super explicit about existential risk.

  5. UK government sets up $11mm grant program for AI safety.

I looked over the NIST strategic vision. I have no particular objections to it, but neither does it involve much detail. It is a case of successfully not messing up.

Some have ambitious further plans.

Eva Behrens: Here are 5 policy recommendations for the upcoming AI Safety Summit in Seoul, from me and my colleagues at ICFG.

In Bletchley, world leaders discussed major risks of frontier AI development. In Seoul, they should agree on concrete next steps to address them.

Overview

In accordance with the shared intent communicated through the Bletchley Declaration to deepen international cooperation where necessary and mitigate catastrophic risks from advanced AI, we urge countries attending the Summit in South Korea to jointly recognise that:

  1. The development of so-called long-term planning agents (LTPAs) should be prohibited until proven safe,

  2. Advanced AI models trained on 10^25 Floating Point Operations (FLOP) of compute capacity or more should be considered high-risk and need to be regulated accordingly, and

  3. The open-sourcing of advanced AI models trained on 10^25 FLOP or more should be prohibited.

To build a strong foundation for international cooperation on the governance of high-risk advanced AI, we urge that Summit participants jointly agree to:

  1. Hold biannual international AI Safety Summits, and pick a host country to follow after France and

  2. Keep the focus of the Summits on international collaboration for mitigating catastrophic risks from advanced AI.

Contrast this with SB 1047. This would heavily regulate above 10^25, including full bans on open source (until a protocol is designed to allow this to happen safely, they say, no idea what that would be), with no adjustments over time. SB 1047 starts at 10^26, requires only reasonable assurance, and has a $100 million minimum such that the threshold will rapidly scale higher very soon.

Indeed, the ICFG says the threshold should over time be adjusted downwards, not upwards, due to algorithmic and hardware improvements.

This also proposes a ban on ‘long term planning agents,’ which unfortunately is not how any of this works. I don’t know how to allow short term planning agents, and effectively stop people from making long term ones. What would that mean in practice?

There was this talk that included Yoshua Bengio, Max Tegmark and Jaan Tallinn.

What about the full International Scientific Report on the Safety of Advanced AI? I looked briefly and I was disappointed. Over 95% of this report is the standard concerns about job displacements and deepfakes and privacy and other similar issues. The one section that does address ‘loss of control’ says experts disagree about whether this could be a concern in the future if we create things smarter than ourselves, so who can say.

They even say that a loss of control of highly capable AI systems is ‘not necessarily catastrophic.’ That is the only time the word ‘catastrophic’ is used, and they do not say ‘existential.’ ‘Extinction’ is only mentioned once, in the section directly after that, entitled ‘AI researchers have differing views on loss of control risks.’ Thus, despite the conference saying it should focus on existential dangers, this report is in effect highly dismissive of them, including implicitly treating the uncertainty as a reason to throw up one’s hands and focus instead on issues like implicit bias.

Top AI labs are currently dramatically insecure. As the value of their model weights and other assets rises, both commercially and as an existential risk and matter of national security, this will increasingly become a problem. Alexander Wang, CEO of Scale AI, did a ChinaTalk interview in which he emphasized the need to lock down the labs if AI capabilities continue to advance.

Rand recently came out with an extensive report on how to secure model weights. As they note, securing only the model weights is a far more tractable problem than securing all the data and algorithms involved. They assume future frontier models will be larger, and online API access will need to be widespread.

Here is a Q&A with director Sella Nevo, one of the coauthors, which goes over the most basic items.

What are their core recommendations?

They start with things that need to be done yesterday. The biggest dangers lie in the future, but our security now is woefully inadequate to the dangers that exist now.

Avoiding significant security gaps is highly challenging and requires comprehensive implementation of a broad set of security practices. However, we highlight several recommendations that should be urgent priorities for frontier AI organizations today. These recommendations are critical to model weight security, most are feasible to achieve within about a year given prioritization, and they are not yet comprehensively implemented in frontier AI organizations.

• Develop a security plan for a comprehensive threat model focused on preventing unauthorized access and theft of the model’s weights.

• Centralize all copies of weights to a limited number of access-controlled and monitored systems.

• Reduce the number of people authorized to access the weights.

• Harden interfaces for model access against weight exfiltration.

• Implement insider threat programs.

• Invest in defense-in-depth (multiple layers of security controls that provide redundancy in case some controls fail).

• Engage advanced third-party red-teaming that reasonably simulates relevant threat actors.

• Incorporate confidential computing to secure the weights during use and reduce the attack surface. (This measure is more challenging to implement than the others in this list but is backed by a strong consensus in industry.)

This is the least you could do if you cared about the security of model weights. Have an actual plan, limit access and attack surface, use red-teaming and defense in depth.

As Leopold noted, our goal must be to stay ahead of the threat curve.

The authors note that FBI Director Christopher Wray implied China had a workforce of more than 175,000 hackers. If China wanted to go full OC5+, they could. For now it would not make sense given the economic and diplomatic costs. Later, it will.

They also say North Korea invests ‘between 10% and 20% of the regime’s military budget’ in cyberwarfare, between $400 million and $800 million. I presume they do this largely because it is profitable for them.

Everyone acknowledges that an OC5-level attack on any major lab would almost certainly succeed. For now, that is fine. The question is, when does that become not fine, and where should we be right now? Should we be able to block an OC4 attack? I certainly hope we would be able to block an OC3 one given the value at stake.

We do not need to attempt bulletproof security until we are under robust attack and have assets that justify the real costs of attempting bulletproof security. We do need to be trying at all, and starting our preparations and groundwork now.

Longer term we will need things like this to have much chance, similar to what one would do if worried about model self-exfiltration, which we should be worried about in such scenarios as well:

• physical bandwidth limitations between devices or networks containing weights and the outside world

• development of hardware to secure model weights while providing an interface for inference, analogous to hardware security modules in the cryptographic domain

• setting up secure, completely isolated networks for training, research, and other more advanced interactions with weights.

They highlight 38 potential attack vectors in 9 categories.

How many resources are needed to launch various attacks? They have a table for that.

The numbers here are weird, representing chance of success linearly from <20% to >80%, against an arbitrary target. I would think things would scale differently.

I also do not think that ‘up to 20% chance of success’ is the right category? If something has a 10% chance of success it is a big deal.

Also important is that this is an enumeration of things we know about. That is a lower bound on the risk. The actual situation is far worse, because it includes unknown unknowns. It is very hard for the things we do not know about to be ‘good news’ here.

For multiple reasons, it is prudent to recognize the plausibility of current assessments underestimating the threat:

• We assume that other attack vectors exist that are as yet unknown to security experts, particularly ones concerning advanced persistent threats (APTs), such as state actors.

Novel attack vectors and conceptual approaches are likely to evolve over time, as are novel insights and infrastructure that make existing attacks more accessible.

• Publicly known examples of attacks are only a subset of attacks actually taking place, especially when it comes to more-advanced operations. Most APTs persist for years before discovery.

Many national security experts with whom we spoke mentioned that the vast majority of highly resourced state actor attacks they are aware of were never publicly revealed. This means that a purely empirical analysis based on detected operations would systematically underestimate the feasibility and frequency of advanced attack vectors.

Accordingly, one should expect capable actors to have access not only to well-established attack vectors but also to unknown approaches. In Appendix A, we share many examples of state actors developing such conceptually novel attacks years or decades before they were discovered by others.

Bold is mine. All of that involves human attack vectors only. If we include future AI attack vectors, enabled by future frontier models, the situation gets even more dire if we do not bring our new capabilities to play on defense with similar effectiveness.

Chapter 6 proposes that labs define security levels (SLs) from SL1 to SL5. If you are SL(X), you are protected against threats of OC level X.

So what does it take to get to even SL1?

In some senses this is easy. In others, in the context of a startup? It is asking a lot.

Moving to SL2 means ‘industry best practices’ across the board. Doing all of the standard things everyone says one should do is a standard few companies, in practice, actually meet. Almost everyone is doing some number of ‘stupid’ things in the form of not doing some of the things on this list.

What about SL3? It is essentially more of the same, only more so, and with serious worries about insider threat vectors. Any individual item on the list seems plausible but annoying. Doing all of them, in a world where your weakest point gets attacked, is not going to happen without a concerted effort.

SL4 gets expensive. Things are going to get slowed down. You do not want to be implementing this level of paranoia too early.

SL5 is that much more expensive to implement. You have to care quite a lot. Having eight security layers is quite the ask as are many other action items.

Is all that necessary? Would it even be sufficient? Consensus weakens as you move up to higher security levels.

There are deeper and more conceptual disagreements about what is needed to achieve the security implied by SL4 and SL5—with opinions ranging from the SL3 benchmark being sufficient to secure against all threat actors to claims that no system could ever present a significant hurdle to operations in the OC5 category.

A particular point of disagreement was the number of people who should have authorization to access the weights. Some experts strongly asserted that the model weights cannot be secure if this number is not aggressively reduced (e.g., to the low tens); others claimed that such a reduction would not be necessary, feasible, or justified.

I have definitely talked to an expert who thought that against an OC5 operation all you can hope to do is buy some time. You can prevent them from stealing everything the first day they set their sights on it, but protecting assets over time is, they claimed, rather hopeless. I haven’t seen credible claims that SL3-style procedures would be sufficient to protect against OC5, and I find that highly implausible, even if it has rarely if ever been tried.

The low tens seems to me quite a lot of people to have access to your core asset. I am not sure how different ‘low tens’ is from infinity. Certainly if your plan involves dozens of people each not being compromised, then you have no plan.

The second half of the report is details of the different attack vectors.

House appropriations bill cuts $100 million in funding for NIST. This is one of the worst things to be cutting right now; NIST is already woefully underfunded.

New paper on Risk Thresholds for Frontier AI. How should we combine compute thresholds, risk thresholds and capability thresholds? The conclusion is to primarily use capability thresholds but have them be informed by risk thresholds.

I am going to quote this in full because it feels like a good steelman of being skeptical about going too far too fast on regulation.

Seb Krier (Google DeepMind): I tend to think of AI policy in three consecutive phases: observation and monitoring; standardization and norm-setting; and then rules, law, and regulations if necessary. My impression is that in recent years some governance crowds have taken the reverse approach, motivated by the usual policymaker urgency of ‘we must do something now’. The problem with this is that you now have to define and cement very precise things that are still evolving, like evaluations and mitigations. Combined with the many trade-offs, inefficiencies, conflicting interests, low capacity, and frankly generally poor decision-making that governments currently suffer from, this often leads to messes, evidentiary gaps, legal risks, and rushed policymaking.

To be clear, I definitely think AI is a technology that will warrant some degree of regulation – and there may well be sector-specific uses or applications that warrant this now. I think cybersecurity-oriented regulations make more sense than omnibus regulatory behemoths. But at a more general level, I feel like we’re still in a phase where the value comes from research and finding things out. And I’d rather see 50 organizations developing evaluations and 5 advocating for regulations rather than the reverse (i.e. what we have today). This is also why I’m quite supportive of the experimental nature of institutions like the AI Safety Institute, where both sides iteratively learn as things progress.

Some people justify hasty policymaking because they think we will have AGI very soon and therefore this demands quick pre-emptive action, otherwise governments won’t have time to intervene. I think it’s right to try to pre-empt things, prepare institutions, and think ahead – but I don’t think timelines alone grant a carte blanche for any kind of legislation. Plus if we are indeed getting very close to AGI, I have 0 doubt that governments will inevitably wake up – and the implications, particularly for large risks, will be a lot more Leopold-like than creating a new GDPR for AI.

So essentially:

  1. For now we should observe and monitor, lay groundwork such as with NIST, and perhaps do select sector-specific interventions such as in cybersecurity.

  2. Later we will do, and will want to do various regulatory actions.

  3. But let’s try and push the key decisions forward in time so we learn more.

Also GDPR is a deeply stupid law. Do not make laws like GDPR. They do great harm by creating frictions without accomplishing almost anything.

It is also correct to worry about regulatory lock-in. Not infinitely worried as in ‘anything imposed is automatically forever,’ but yes there is a lot of inertia and these things are hard to reverse.

How much do we need to worry about moving too slowly? That depends on:

  1. How long you think we have.

  2. How quickly you think we can move.

  3. How sensibly you think we would move in a crisis but with more information.

  4. Whether you think that by the time there is a crisis, it will be too late.

Reasonable people disagree on all those questions.

What most critics and skeptics fail to do is differentiate their responses to different types of regulatory proposals.

As in, is a proposal about observing and monitoring and allowing us to intervene when the time comes? Or is it attempting to intervene now on what people can do now, or dictate the form of intervention later?

Consider the response to something like SB 1047 or Biden’s executive order. Both are primarily about transparency, observation and monitoring of frontier models for the sole purpose of concerns on catastrophic or existential risks. They are deeply compatible with the perspective outlined here by Krier.

The logical response is suggesting improvements and discussing details, and talking price. Instead, most (not Krier!) who are skeptical of other forms of regulation choose for SB 1047 instead to hallucinate a different bill and different impacts, and for the executive order to demand it be repealed. They hallucinated so badly on SB 1047 that they demanded the removal of the limited duty exception, a free option that exclusively lightened the burden of the bill, and got their wish.

The logic of these others seems to be:

  1. You want to be able to observe and monitor, and prepare to act.

  2. If you did that, you might later act.

  3. Can’t have that. So we can’t let you observe or monitor.

SB 1047 has strong bipartisan public support (77%-13%), if this is how you ask about it. I notice that this is not exactly a neutral wording, although its claims are accurate.

This is unsurprising, although the margin is impressive. We have yet to see a poll on AI that doesn’t go this way.

The LA Times discusses SB 1047 and other proposed bills here. All the other bills seem actively counterproductive to me, especially the pure rent seeking demand from teamsters for supervision of self-driving trucks.

Dean Ball argues that SB 1047 is bad because it creates a government regulatory agency, via a fully general public choice counterargument against having government regulatory agencies for anything with broad positive use cases. I ended up discussing various SB 1047 things on Twitter a bit with him and Eli Dourado.

Politico covers that Y Combinator sent a letter opposing SB 1047. While the letter refreshingly says that the law was clearly drafted in good faith, all four of the letter’s listed concerns misstate the practical implications of the bill in alarmist terms. Then they say, rather than proposing fixes to particular issues, why not scrap the whole thing and instead encourage open source software? It is telling that such letters so often ask not only for no rules of any kind, but also for active government handouts and special treatment, despite SB 1047 already giving open source special treatment.

Dwarkesh Patel interviews Tony Blair, with AI as a major focus. Blair sees AI as the biggest change since the industrial revolution, the most important thing to focus on. He very much gives off the technocrat ‘this is how it all works’ vibe, without pretending that the technocrats are generally in charge or that governments are competent. He sees that AI will be huge but doesn’t seem to notice the existential risk angle. Essentially he is a sensible ‘AI skeptic,’ who does not expect AGI or a takeoff but sees that AI would be transformative anyway. His focus has been ‘good governance,’ so he pulls out the standard good governance tropes. He also emphasizes that policy and politics (or ‘change makers’) are distinct things, and if you want to accomplish anything you have to be policy first.

Also has this great line from Blair: “The problem with government is not that it’s a conspiracy, either left-wing or right-wing. It’s a conspiracy for inertia.”

Interview with OpenAI board chairman Bret Taylor. He is excited for this generation of AI. His focus is clearly being CEO of Sierra, where he is building hopefully cool solutions for consumer brands, rather than his far more important role at OpenAI. That does at least mean he has lots of practical experience with current models. He holds his own on mundane job transitions but does not seem to be feeling the AGI. Instead he says, beware specific hype, but the economy will transform within 30 years and this will ‘meet the hype.’ Someone needs to have him talk to the technical staff. For now, it seems he does not grok existential risk because he doesn’t grok AGI.

Lester Holt interviews OpenAI CEO Sam Altman and AirBnB’s Brian Chesky; skip to about 35:00, it is ~40 minutes and often not dense with new information. Colin Fraser notes some of the ways Altman is playing rhetorical sleight of hand with the risks of AGI. If you expect to be able to tell an AI ‘go solve all of physics’ or ‘go create a great company’ then that is a completely transformed world; you cannot simply talk about ‘solving misuse’ as if misuse were a distinct magisterium.

When discussing events around Altman’s firing, Altman sticks to his story and lets Chesky tell a series of rather glaring whoppers. Both try to walk back the idea of an ‘AGI moment,’ there are only various capabilities in various areas, and try to deny that there is ‘a race’ in a meaningful sense. Altman follows the general theme of acting like everything will stay normal under AGI. I know he knows better. When he says ‘AGI could double the world’s GDP’ Holt points out this sounds outlandish, but I see it as outlandish on the downside and I think Altman knows that.

And he is playing the ‘we have great ability to steer our current models and their values’ card, claiming the real problem is choosing our values, which I see as a highly disingenuous attempt to dismiss alignment problems as being handled.

Mira Murati talks to Dartmouth Engineering, where she is an alumna. It has some key spots but low information density.

  1. She says we should expect to get ‘PhD-level intelligence for specific tasks’ in a year to 18 months. The usual suspects responded to this by saying no GPT-5 for over a year and did some gloating, which seems like the wrong response to this kind of prediction.

  2. She was broadly supportive of the government understanding what is going on and called for more of that.

  3. She says of the AI ‘it’s a tool, right,’ and the subtle blackpill is that she does not seem to notice that this might not be the full story in the future.

  4. It does seem she said ‘Some creative jobs maybe will go away due to AI, but maybe they shouldn’t have been there in the first place.’ Hot take. She then tried to save it on Twitter.

Roon (linking to this clip from this segment): I f***ing love Larry Summers.

Beff Jezos (responding to clip): So f***ing based, holy s***.

Larry Summers introduces Bloomberg to the concept of recursive self-improvement, eventually using the term explicitly, and predicting transformative and seismic change. The issue, he says, is how do you manage that? He says we cannot leave AI only to AI developers. Public authorities must take a strong role in ensuring it gets used for good, but stopping it or slowing it down without thinking about positive developments would cede the field to the irresponsible and our adversaries, and he endorses ‘responsible iterative deployment.’

If this counts as highly based, where public authorities must take a strong role, and we should consider the positive benefits and also the downsides, perhaps we are getting somewhere. Lots of great stuff here, we need to now also work in alignment and the control problem, which did not get mentioned.

New interview with Anthropic CEO Dario Amodei. I haven’t listened yet.

Yuval Noah Harari asks, among other things, what happens when finance reaches the point where zero humans understand the financial system? Would we end up controlled by an essentially alien intelligence? This specific mechanism is not that high on my list. The generalized version is reasonably high. Yes, of course, we will be under immense pressure to turn control over all of the things to AIs.

Leo Gao of OpenAI reminds us we do not know how neural networks work. He does so in response to someone citing Leo Gao’s paper as evidence to the contrary that someone ‘must have missed.’ When the moment was described, he did not take it great.

This does seem to be accurate.

Agustin Lebron: No one:

Absolutely no one:

Every AI researcher: AGI is incredibly dangerous and no one should build it. Except ME. I can do it safely.

Eliezer Yudkowsky: Elon starts OpenAI because he doesn’t like Demis. OpenAI people repeatedly distrust OpenAI and leave to start their own companies… none of which trust *each other*… and one observes that they’re all founded by the sort of people who went to work for OpenAI in the first place.

Elon Musk: Actually, I like Demis. Just don’t trust the Google corporate blob.

Eliezer Yudkowsky: Apparently I’ve heard and told the wrong story all these years!

Reluctantly — because I do usually prefer to listen to people when they tell me what they actually said or thought, what with my not being a telepath — I feel obligated to mention that 3 different sources reached out to me to say, ‘No, Elon actually did dislike Demis.’

This puts me in an odd position and I’m not sure what I’ll say going forward. I am really reluctant to contradict people about what they themselves thought, but I also don’t want to represent a mixed state of evidence to the public as if it was a purer state of evidence.

An attempt to portray AGI existential risk as risk of domination. Would such a focus on such details convince people who are not otherwise convinced? My guess is some people do respond to such details, it makes things click, but it is hard to predict which people will respond well to which details.

I’m not going to lie and say it’s good. That doesn’t mean give up.

Alex Trembath: When I tell people I work in environmental policy, the most common response, BY FAR, is to ask me “How fucked are we?”

Kelsey Piper: People say this to me about climate and about AI. Guys, there are lots of serious challenges ahead but we are an inventive, wealthy, ambitious society with lots of brilliant hardworking people and all of our problems are solvable. We’re not doomed, we just have a big to-do list.

One reason I sincerely love Silicon Valley despite its deficits is that it’s the only place where I’ve run into strangers who will listen to a description of a serious problem they haven’t heard of before and go “huh.” [beat.] “What needs doing?”

Everyone who thinks you should obviously do [insane thing] is wrong. That is the easy realization. The hard part is: What is the sane thing?

Francois Fleuret: AGI happens in 3y, where should I invest my money?

Eliezer Yudkowsky: Everyone in the replies is saying “Guns and bullets” and I regret to inform everyone THAT WILL NOT ACTUALLY WORK.

There were a ton of replies to Fleuret. They did not contain original ideas. The most common were things like energy, Microsoft and Nvidia, which are a way to go out while having previously had more dollars to your name.

As many have long suspected about many accelerationists: The positions of Beff Jezos make a lot more sense if he simply does not believe in AGI.

Beff Jezos: ASI is a fairy tale.

Explain to me.

What the f is “ASI”.

FORMALLY.

Seriously. I’ll wait.

Mario Cannistra: Explains a lot.

Of course I’d want to accelerate if I didn’t think superintelligent AI was even possible.

We can safely consider the matter closed, then.

We now know why he named his new company xAI.

Elon Musk: The trend is very strong that any AI company’s name that can be inverted will be inverted.

Technology advances.

AI #70: A Beautiful Sonnet Read More »

childhood-and-education-roundup-#6:-college-edition

Childhood and Education Roundup #6: College Edition

Childhood roundup #5 excluded all developments around college. So this time around is all about issues related to college or graduate school, including admissions.

What went wrong with federal student loans? Exactly what you would expect when you don’t check who is a good credit risk. From a performance perspective, the federal government offered loans to often-unqualified students to attend poor-performing, low-value institutions. Those students then did not earn much and were often unable to repay the loans. The students are victims here too, as we told them to do it.

Alas, none of the proposed student loan solutions involve fixing the underlying issue. If you said ‘we are sorry we pushed these loans on students and rewarded programs and institutions that do not deserve it, and we are going to stop giving loans for those programs and institutions and offer help to the suffering former students, ideally passing some of those costs on to the institutions’ then I would understand that. Instead, our programs are moving dollars mostly to relatively rich people who can afford to pay, and by offering forgiveness we are making the underlying problems far worse rather than better. Completely unacceptable even if it were constitutional.

Colorado governor Jared Polis, who really ought to know better, signs bipartisan bill to make first two years of college free for students whose family income is under $90k/year at in-state public schools. Technically this is 65 credits not counting AP/IB, concurrent enrollment, military credit or credit for prior learning, so there is even more incentive to get such credits.

The good news is they do not have a full cliff; the benefit phases out as you approach $90k, so they dodged the full version of quit-your-job insanity. The obvious bad news is that this is effectively one hell of a tax increase.

The less obvious bad news is this is setting up a huge disaster. Think about what the student who actually needs this help will do. They will go to a local college for two years for free. If they do well, they’ll get to 65 credits.

Then the state will say ‘oops, time to pay tuition.’ And what happens now? Quite a lot of them will choose to, or be forced to, leave college and get a job.

This is a disaster for everyone. The benefits of college mostly accrue to those who finish. At least roughly 25% of your wage premium is the pure Sheepskin Effect for getting your degree. If you aren’t going to finish and were a marginal student to begin with (hence the not finishing), you are better off not going, even for free.

I do not think we should be in the business of providing universal free college. There are real costs involved, including the negative externalities involved in accelerating credentialism. However, if we do want to make this offer to help people not drown, we need to at least not stop it halfway across the stream.

The real-life version of the college where there are degree students who pay for a degree but aren’t allowed to come to class, versus non-degree students who get no degree but are educated for free. To be clear, this is totally awesome.

David Weekly: This seems kinda…radical? ASU makes its courses available to anyone for $25/course. After you take the class, if you want the grade you got added to an official transcript with a credit you can use, +$400. These are real college credits. 8 year olds are getting college credits!

Emmett Shear: This is cool to me because you can see the core of university economics right there. Bundling $25 worth of education with $400 of credentialist gatekeeping. I’m not blaming ASU, it’s cool they’re doing this, but that is deeply broken.

Sudowoodo: Totally understand your comment but this is the best possible instance of a college credit system I’ve seen. One course for $400 equals 120 credits of a degree for $16k (plus the $25 per course), or an additional major for just a few thousand dollars.

Emmett Shear: Right, but that just goes to highlight how absurdly overpriced the credentials are vs the actual education.

James Hulce: I did 70+ credits under this program. During the early years of the pandemic ASU reduced the credit conversion fee to $100 and waived the $25 enrollment fee, so I took a wide variety of courses. Overall very happy with the quality and delivery.

Aside from being virtual, this product is vastly better than the normal one. You get to try out courses for $25 and bail if they are no good. If you struggle, or you get bad grades, you can start over again for another $25 or bail. You are never stuck with a bad grade. Then at the end, after you pay for the credits, it is still a deep discount, an entire degree for $16k.
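For concreteness, here is a minimal sketch of that arithmetic, assuming a standard 120-credit degree built from 3-credit courses (my illustrative assumption, not ASU’s published structure):

```python
# Rough cost sketch for the ASU-style "convert credits later" model.
# Assumptions for illustration: 120 credits total, 3 credits per course,
# $25 to take each course, $400 to put each course on an official transcript.
CREDITS_NEEDED = 120
CREDITS_PER_COURSE = 3
TAKE_FEE = 25        # per course, to take the class
CONVERT_FEE = 400    # per course, to convert the grade into transcript credit

courses = CREDITS_NEEDED // CREDITS_PER_COURSE   # 40 courses
education_cost = courses * TAKE_FEE              # $1,000 to take everything
credential_cost = courses * CONVERT_FEE          # $16,000 to make it a degree

print(f"Courses: {courses}")
print(f"Education (enrollment) cost: ${education_cost:,}")
print(f"Credential (conversion) cost: ${credential_cost:,}")
print(f"Total: ${education_cost + credential_cost:,}")
```

On those assumptions, roughly $1,000 of the total buys the teaching and $16,000 buys the transcript, which is Shear’s bundling point in numbers.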

Of course, this is Arizona State University, so the real product (by reputation) is neither education nor credential. Rather it is the cool parties. This program cannot help you with those. But if you are cool enough and show up, they are also close to free.

The big picture is that trust in academia, like many American institutions, is rapidly collapsing, among essentially all groups.

Here is one theory on (one aspect of) what happened.

Derek Thompson: Why is trust in US institutions—esp colleges—collapsing? Here’s a theory. The 21st c has become the age of the unfocused institution—the age of mission inflation, goal ambiguity, and complex orgs losing any clear sense of priority, or identity.

Odalisk Flower: The university is supposed to solve the perennial question of the American Experiment: How do we get the benefits of an intellectual elite without the drawbacks of a hereditary aristocracy?

What has changed recently is common knowledge that this particular solution has failed.

In fact, it has failed so spectacularly that dissidents are now floating suggestions that perhaps a hereditary aristocracy isn’t so bad after all. For most, this is still outside the Overton window, but it’s wild how fast that window is moving.

I have not noticed rising whispers of the potential wisdom of hereditary aristocracy, indeed neoreaction seems to be fully dead. From where I sit, there is broad recognition that the universities and our other institutions have failed, without any particular suggestion about what plausible replacement would be superior beyond building private local alternatives. My expectation is that the replacement will emerge out of the transformations wrought by AI, whether or not it is an improvement.

Harvard students are highly stressed, despite having made it to Harvard, says Harvard Crimson. I would note that getting mental health counseling is often a function of how and when counseling is provided as much as it is about actual mental health – if we applied today’s standards to 2017 I bet the graph starts substantially higher.

Is this despite, or because, of the very high grades?

Article goes into the usual suspects, overscheduling, lack of social time, social media, hyper-competitiveness and perfectionism. Everyone running between ‘pre-professional’ activities trying to stand out. Harvard, the author says, is now a group of students obsessed with their relative status. Sounds like what would happen if you filter for exactly that type of young person, then put them all in the same place to compete, without the ability to differentiate themselves with grades because everyone who wants one has a 4.0.

Not that everyone in the Ivy League actually has a 4.0. Grade inflation is high, but these percentages of A grades from Yale are still a lot less than 100%, and inflation may have at least temporarily peaked:

The patterns here are clear, such that small surprises stand out and seem meaningful. Are we not appreciating what is happening in psychology? Their studies may not be replicating, but the grades are not either. You have to respect that. Whereas physics seems to have gone rather soft.

What does it say about the students who choose various majors and classes, given this wide distribution of grades? One could say that students going into education studies are smarter because they knew to secure better grades. Or one can say they went that way because they can’t hack it, or did not care to. Or one could say that your 4.0 in education studies means nothing (above getting into Yale in the first place) and everyone will know that.

Obviously we need a meaningful range of grades, otherwise students cannot differentiate themselves based on grades, so they both won’t care about doing well and learning, and they will become obsessed with other signals and status markers.

Ben Golub: This is real and is creeping outside Harvard to most elite private schools. Grades should be made to matter again, and instructor evaluation practices should be adjusted to give instructors a free hand to give bad grades!

Orin Kerr: Very interesting essay by Harvard undergrad @aden_barton, arguing that Harvard undergrads don’t spend a lot of time on classes and studying—which he attributes mostly to grade inflation. If grades are compressed around “A”, there isn’t much to study for.

Aden Barton (essay in Harvard Crimson): In the final class, each student was asked to cite their favorite readings, and the professor was surprised that so many chose readings from the first few units. That wasn’t because the students happened to be most interested in those classes’ material; rather, that was the brief period of the course when everyone actually did some of the readings.

Despite having barely engaged with the course material, we all received A’s. I don’t mean to blame the professors for our poor work ethic, but we certainly would have read more had our grades been at risk. At the time, we bemoaned our own lack of effort. By that point in the semester, though, many other commitments had started requiring more of us, so prioritizing curiosity for its own sake became difficult.

And therein lies the second reinforcing effect of grade inflation, which not only fails to punish substandard schoolwork but actively incentivizes it, as students often rely on extracurriculars to get ahead. Amanda Claybaugh, dean of undergraduate education, made this point in a recent New York Times interview, saying that “Students feel the need to distinguish themselves outside the classroom because they are essentially indistinguishable inside the classroom.”

How bad is it? Oh my lord.

Zalman Rothschild: I was a teaching fellow for two classes at Harvard College when I was at HLS. One was taught by an amazing visitor from Dartmouth. He enforced a strict curve. The other was taught by a Harvard prof. He informed us TFs that an A is the default grade. A- would require justification.

Orin Kerr: Jesus H.

Maggie Wittlin: No no, that’s the law school system.

Matt Yglesias says students in college should study more, and we should hold them to actual standards.

Right now, they are doing remarkably little real work.

When you add in-class education, homework, other educational activities and outside work (which I would say largely counts as educational and is often necessary for support), we get 5.1 hours per day for ‘full time’ college students, or 35.7 hours a week.

Matthew Yglesias: Philip Babcock and Mindy Marks have shown that over the decades, students have been spending less and less time on studying — “full-time students allocated 40 hours per week toward class and studying in 1961, whereas by 2003 they were investing about 27 hours per week.”

I agree with Yglesias that to fix this we would need a dramatic reversal of grading practices. You need willingness to actually punish students who are not getting it done, with actual life consequences on more than the margin, or it won’t work.

Matt Yglesias: The nascent Summers-era crackdown was turning A-s into B+s and B+s into Bs. That generated some whining from students, but ultimately, to restore old-school academic values, schools will need to hand out Cs and Ds that put students at the risk of real negative consequences, like loss of scholarships, getting kicked out of school, or heading into the job market looking like a real fuckup.

And then you get the problem that Hunham confronted: Is this what students and their parents want?

It is indeed not what most parents and students want. Which means we know what product they are mostly buying, and the universities are mostly selling.

And like so many other things these days, there is remarkably little product differentiation. Almost no one is willing to say, this is something different, and we will get those who want that different product, and employers or prospective citizens or what not who want that product can reward that. It is odd to me that this is rare. If all the selective universities are rejecting most applications, so what if 90% of students and parents recoil in horror, so long as the other 10% are excited? Or 98% and 2%?

The killing of Harvard’s Math 55. John Arnold contrasts an ‘06 Crimson article on how hard the course is, with a ‘23 Crimson article showing how it is no longer special. One can reasonably argue that if 70 start, 20 finish and only 10 understand, maybe that is bad actually, but I disagree. I think that math is a place for exactly that, because failure is an option. You want to provide the real thing, and it is fine if the majority can’t hack it and drop out. If we can’t fail here, where can we fail?

Claim that in the wake of their donors pulling out complaining about antisemitism, the price for Ivy League admission via donation has effectively been slashed on the order of 90%, from $20 million to $2 million. That seems clearly below the market clearing or profit maximizing price? The optics of doing large volume on this also seem pretty terrible. Kids whose parents can pay $20 million are people you want as peers so you can network, but at $2 million that advantage mostly fades. At some point the damage to the student body adds up. So I’m skeptical.

To what extent are we seeing a shift lowering the value of Ivy League degrees?

Nate Silver: This speaks to the story I wrote earlier this week. Yes, the value of your Ivy League degree is going to be affected if people start to associate your school with political activism instead of academic rigor.

Andrew Ross Sorkin (NYT): Businesses may be unlikely to rush into formally patrolling universities’ policies by adopting either of these theoretical maneuvers, but they might amp up the pressure in some other way through their informal preferences. As Darren Woods, the chief executive of Exxon Mobil, said of campus protests in an interview with CNBC this week: “If that action or those protests reflect the values of the campuses where they’re doing it, we wouldn’t be interested in recruiting students from those campuses.”

John Arnold: Anecdotal, but I’ve had several conversations in recent years with people who hire undergrads for highly competitive jobs (tech, finance, consulting etc) that are moving away from the Ivies and towards flagship state universities, citing better cultural and professional fit.

Now confirmed with data. Forbes surveyed managers with hiring authority. When asked whether more/less likely to hire vs 5 years ago:

Ivy League: 7% more likely; 33% say less likely

Public univs: 42% more likely; 5% less likely

Selective privates: 37% more likely; 5% less likely

I would classify the selective privates at least half with the Ivies, not mostly with the public universities, if I was doing this style of recruitment.

Preston Cooper provides an entry in the genre where you measure the financial ROI of various college degrees given different universities and majors. 31% of degrees were negative ROI, once you factor in time costs and risk of not finishing.

Every time we run this test we get a graph of majors that looks like this:

That then interacts with different colleges, which differ in many ways including completion rates. And of course, if you switch programs based on this information, you do not suddenly get the completion rate (or net life impacts) of the degree you switch to, even if the original study was done fully correctly.

The return on master’s degrees was not so great.

Preston Cooper: What about grad school? It’s complicated.

Med school & law school have huge payoffs.

But nearly half of master’s degree programs leave students in the red.

How much government funding goes to programs with no return? We can answer that thanks to new data.

Programs in the ROI database received $418bn in funding from 2018 to 2022.

Of that, $122bn (29%) flowed to negative-ROI programs.

It would be highly reasonable to tie government funding to program ROI, if we had a good measurement of that, but that is not how our government works.

Here is the data dashboard. In which I learned that my degree and major had negative ROI by this metric, whereas if I had switched majors from Mathematics to Economics like I considered, I would have had a vastly easier job all around and also picked up almost three million dollars (!) in expected value.

I don’t buy the full result there, but if this reflects reality even somewhat, letting me make this mistake and stick with Mathematics, without even a warning, was deeply, deeply irresponsible.

Ideally we would get a more detailed breakdown, but yes.

Derek Thompson: Before the pandemic, new england colleges had more than 2x more applicants than southwestern colleges.

At current trajectories, southwestern college applicants will surpass new england in two years.

Nate Silver: This is pretty interesting in light of yesterday’s post.

There’s an inverse correlation between the left-wingness of the colleges in each region and growth in applications.

A lot of students just want to go to college to drink beer, hook up, go to football games, and emerge with a degree that will give them gainful employment. They far, far outnumber the political activist types. And they’re voting with their feet, it looks like.

Most students care primarily about things other than political activism. The problem for them is that college is a package deal. (Almost?) all the selective colleges have lots of political activism and force you to care deeply about things that are neither fun nor going to be useful to your future or part of getting a traditional education. And at least faking those things is deeply tied to your admission to those schools and to your social life and experience in class and administrative rules set once you arrive.

Colleges are reversing course, and admitting that yes standardized test scores are required for admissions.

It was completely insane to drop this requirement. Doing so only hurt the exact people they claimed to be trying to help. The good news is, while we have a long way to go, we seem to be past peak insanity in such matters.

Nate Silver: The critique that universities are run like for-profit corporations that are mostly concerned about the bottom line is correct. Also, that’s what might save them.

A new way has been found to discriminate.

Steve Miller: UCSD announced a new policy April 9 to exclude students whose parent is college educated and makes over $45,000 from enrolling in computer science or other selective majors, unless spots are available after first generation or low income students enroll.

Nearly 40% of all UC San Diego students are first generation students.

So if you are not a first generation student or low income, it will likely become virtually impossible to enroll in computer science or other selective majors.

This policy applies to students seeking to enroll in selective majors after their initial admission to the university, as the policy linked in the original post specifies.

Separate preferences for first generation students apply in admission.

This likely effectively means that if you are not a first-generation college student (and an in-state student) then you will not be able to transfer to a selective major, no matter your GPA. Those making these decisions have made their motivations and intentions clear, so go in with your eyes open, both reading the fine print and realizing that they could add more fine print later.

But also, it seems odd that students want to major in computer science, and we are saying no rather than expanding the program? Isn’t that exactly what we want?

Perhaps our children are learning after all. They can solve for the equilibrium.

That was back in 2021. Presumably this number has only gone up since then.

The Hill:

  1. A survey found that 34 percent of white students who applied to colleges falsely claimed they were a racial minority on their application.

  2. Most students, 48 percent, claimed to be Native American on their application.

  3. Seventy-seven percent of white applicants who lied about their race on their application were accepted to those colleges.

According to Intelligent.com Managing Editor Kristen Scatton, the prevalence of applicants who claim Native American ancestry is possibly due to the popular narrative that for many Americans, a small percentage of their DNA comes from a Native American tribe.

It is not clear this is helping the applicants much, whether or not they were caught. Liars got accepted at a 77% clip, but the typical acceptance rate overall is already about 65%, and it is not clear this is ‘accepted at any given college’ rather than at all, and there are various other factors in both directions.

What’s totally crazy is doing the math on this.

  1. About 50% of college applications are from white students.

  2. White students report they lied 34% of the time.

  3. Of those students, 48% pretended to be Native American.

  4. That means that 5.8% of applications are falsely claiming to be Native American.

But the rate of real Native American applications is only about 1%. So that means, even if the other half of applications never lie: If you mark Native American, there is an 85% chance you are lying. Meanwhile, several percent of those who lied checked the box for AAPI, which presumably only hurts your chances even if they believe you.
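A minimal sketch of that last conditional-probability step, taking the figures above as given (roughly 5.8% of all applications falsely claiming Native American status, about 1% genuinely Native American):

```python
# Chance that a checked "Native American" box is false, given the shares above.
# Both inputs are the rough figures from the text, not independently verified.
fake_share = 0.058   # share of all applications falsely claiming Native American
real_share = 0.01    # share of applications from actual Native American applicants

p_lying_given_checked = fake_share / (fake_share + real_share)
print(f"P(lying | checked Native American): {p_lying_given_checked:.0%}")  # ~85%
```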

So yes, I doubt checking that box helps you much on its own.

Phil Magness: If you want to genuinely disrupt higher education for the better, impose severe limits on the number of mandatory GenEd classes that students must take. These courses are the lifeblood of hyper-politicized woke departments that otherwise wouldn’t attract many students.

Most students would be better served by starting their majors earlier and taking more classes in skills and subjects related to their degrees. Most GenEds, as currently taught, are complete wastes of time at best, and political indoctrination at worst.

Those same GenEds serve another function though: they create jobs for faculty in otherwise unpopular disciplines. And the depts that have the heaviest presence on the GenEd curriculum (e.g. English) also tend to be the largest departments on campus, despite drawing few majors.

It’s unethical to require economically precarious 18-21 year olds to pay for classes they don’t need just to keep a horde of English, Sociology, and Foreign Language professors employed.

I had a highly extensive set of general required courses I had to take, something like 40 credits. You could make a reasonable case for the 16 that were reading the ‘Great Works’ of literature and philosophy. There wasn’t a problem with wokeness back then (the closest thing was when I sort of tried to cancel The Symposium for all the praising of child rape, and got told to STFU about it and come to class or else), but still the rest was pointless, a waste of time taking up almost a full year of coursework.

Phil Magness notes that students could instead start their majors. That implies that when you arrive on campus, you should know what major is right for you.

That is another issue with all the required classes. There is little room for exploration, most of those slots are already spoken for. If I had wanted to switch majors to something other than Mathematics, I had almost no opportunities to sample alternatives in time to do this. Realistically I could have probably made it to Physics or Economics, and that’s about it.

Which majors are most often regretted? Humanities.

Jacob Shell: What they don’t tell you in high school or college advisor offices is some of these are “winner takes all” majors and others aren’t. The comp sci normie is making a nice living right now, but the physics major is a sunlight-deprived lab tech for 30 years in a row.

I would have thought the physics majors were mostly not now doing physics? It still makes sense that regret rates are high. Math majors are mostly not doing math all day anymore, but they seem fine with it. As a math major myself, I am an exception, and I do regret it, although perhaps the signaling value made it worthwhile after all.

Here is a different survey that asks the same question. Will you regret that major? This time the answer is, probably.

Regret is an imprecise measure, but these are not small differences.

Thread of what Basil Halperin learned in graduate school. Increasing returns to effort for specialization in terms of skills, whether that translates to world improvement or pay is another question. Nothing here made me think anyone should go to grad school.

Then again, do you go there for the learning?

Here is Bryan Caplan on when to get which Econ PhD. The algorithm is essentially:

  1. Only get an economics PhD at all if you want a job that needs it, such as an economics professor.

  2. If you can get into a top-25 Econ program and endure the pain, go there instead.

  3. If you can’t do either or both of those, you can go with GMU.

  4. When to get a Masters? When you drop out before finishing your PhD.

  5. In any case, if you want this, apply to at least 15 schools, process is super random.

It is no surprise given his other opinions that Bryan Caplan’s answer to that question is a very sharp no. In Caplan’s model the purpose of graduate school is to get a job that won’t hire you without one. That is it.

I think he’s right.

Nate Silver offers related Good Advice.

Nate Silver: Real, non-trollish life advice:

If you’re a smart young person and you really want to go to graduate school, then by all means go. But if you’re on the fence, probably don’t. That’s not where the action is. And it’s not where the action is going to be for the foreseeable future.

The specific exception is if you go to graduate school with the intention of being a sleeper agent to improve academic (or government/broader nonprofit research) culture. That is potentially quite valuable for society (though it won’t necessarily be lucrative for you personally).

I do not believe you when you say you are going to be a Sleeper Agent. I expect you to either get worn down and be a normal academic, or to run away screaming in horror at some point, because man does that all sound miserable. It is a noble thing to do, of course, to be the change you want to see and fight for it, if you can.

All of this emphasizes my basic advice here: going to graduate school is something you should only do with a very specific purpose, and generally only if you can attend an elite institution. Do not go because you have nothing better to do. Have a specific career path in mind, one that either does not face or justifies the long odds usually against such paths. Know what you want to learn, and what you want to prove.

Or, ideally, if you possibly can, go do something else instead.

What is academia for, then? Presumably something else.

Aella: It’s insane how much academia is not about figuring stuff out. The current state of academia is not what it would look like if we went “hey I wanna figure out the truth behind a thing.”

Fred Scharmen (QTing): “Hey I wanna figure out the truth behind a thing” is like what an elementary schooler thinks that grad students are supposed to be doing. I hope this person grows up eventually.

Hazard: Good example of the general vibes and tactics used to haze people into fucked up social orders and institutions without ever having to defend them.

You just mock people who don’t know the scam is a scam.

I’ve written about this before.

It’s a load bearing tactic for maintaining normalization of deviance.

It is worse than that.

I get mocking someone for actually being confused here. One should not do even that. But yeah, if someone with experience straight up said ‘I am shocked, shocked to find that things other than searching for truth are going on in here, how can that be, I am so confused,’ then mockers gonna mock.

This is not someone saying ‘I do not understand why someone is slurring their words in this cafe’ in a world where the cafes were called cafes but were actually bars. This is ‘it really is insane the amount of hard drinking going on in all the cafes, did you notice how rare it is for anyone to get a coffee anymore, they are actually bars’ and someone mocking you, saying ‘coffee is what an elementary schooler thinks people drink at cafes.’ And then everyone went back to pretending cafes only served coffee.

As Dilan Esper and Andrew Rettek note here, the right thing on free speech is to defend everyone’s right to speak. It is in the context of very much not doing this in other contexts, treating a wide variety of far less harmful speech as ‘violence,’ that this sudden realization of one’s principles in this one case rang hollow. No one is pretending this is a new set of general pro-speech principles to be universally applied.

As Jill Filipovic and Jonathan Haidt each note, it would be great if universities used the recent protest moment to realize their systematic error, and broadly once again embrace free speech the way they used to do.

This is the letter the ACLU sent out in 1978 after they defended the right of actual Nazis to march in order to defend free speech for all of us.

You have to let them talk. This is America, man. Or at least, it used to be.

Alas, I am not holding my breath for such an outcome.

If it does happen, Charles Murray has kindly offered to allow the presidents to prove their devotion to free speech by letting him host a talk.

We are not letting them talk. FIRE found that 3% of current college students have been punished for speech, which translates to 5% over four years, which is enough for a hell of a chilling effect especially given how risk averse college students are now.

Jill Filipovic urges us all to rise to the standard of the old ACLU, no matter what others have done, and stand firm for free speech even asymmetrically. Do not call, she says, for more restrictions in the name of even-handedness. That is a tough sell. It is also not obvious which path leads to more free speech. Si vis pacem, para bellum?

Larry Summers points out that Harvard’s multiple Antisemitism Taskforces, which are accomplishing nothing, are the wrong approach, an alternative to both moral leadership and standing up strongly for free speech. Instead, Harvard continues to allow official support of antisemitic positions without allowing the voicing of pro-Israel positions.

Paul Graham links to Richard Florida, a professor at the University of Toronto, who says people in academia now feel more space to speak their minds after recent events.

Here are some examples of other cases where free speech could have been stood up for, and universities chose a rather different path.

Harvard declares it is now mission first. It will no longer make ‘official statements about public matters that do not directly affect the university’s core function.’ I put up a prediction market on whether they stick to it. Good luck, Harvard!

What is Harvard’s mission? Harvard.

Nate Silver: Notable exceptions to free speech:

Incitement

Defamation

Criticizing Harvard

Lawrence Bobo (Dean of Social Sciences, Harvard): A faculty member’s right to free speech does not amount to a blank check to engage in behaviors that plainly incite external actors – be it the media, alumni, donors, federal agencies, or the government – to intervene in Harvard’s affairs.

Lawrence Summers: It takes something extraordinary to bring me into agreement with Israel-demonizing faculty like Walter Johnson. That is what Harvard Dean Lawrence Bobo has done with his call for punishing faculty who publicly challenge university decisions.

I cannot understand why his boss Dean Hopi Hoekstra has not condemned the idea. Nor can I understand how someone who believes in punishing faculty dissent can be allowed to set faculty salaries, decide on promotions or be involved in faculty discipline.

How can it be according to Harvard leaders that it is fine to call for an end to Israel as a Jewish state but not to criticize the University administration?

Students from the University of Waterloo computer science programs have been enjoying outsized success, despite it being a relatively young university founded in 1957. Henry Dashwood looks at what makes Waterloo different. They have a five year program that does not break for the summer, the culture focuses on working on projects rather than partying or sports, they have a startup accelerator on campus, and despite having a lot of CS students they are very selective (claimed 4% acceptance rate).

So it is exactly the story one would expect based on what startup culture says. Focus on building things, cut out everything else.

I am curious if that model will long survive moves like this, although I appreciate that they have a distinct department for pure mathematics:

Chris Brunet: The Department of Pure Mathematics at @UWaterloo is hiring a math professor.

”Eligible candidates for this search must self-identify as women, transgender, gender-fluid, nonbinary and Two-Spirit people.”

Waterloo’s Faculty of Engineering is also hiring an engineering professor.

”Eligible candidates are required to identify as a woman or gender minority, which is defined to include individuals who self-identify as women, transgender, gender-fluid, non-binary, or twospirited.”

Also, 2 professors of computer science.

Joshua Rauh notes that his DEI training included an example in which someone saying ‘DEI has gone too far’ is treated as the first sign of prejudice and on-the-job discrimination.

Alex Tabarrok in response: DEI has gone too far.

Indiana signs a bill introducing ‘intellectual diversity’ as a standard for tenure decisions. Tyler Cowen suggests it will backfire, that observance will be addressed via technical box-checking, and that universities could retaliate by not hiring any actual conservatives (even more than they already do) at all for fear they would be forced to grant such people tenure later. It is extremely difficult to get a bunch of academics who want it to be one way, with only left-wing (or often only far-left-wing) viewpoints welcome in academia, to agree to have it be the other way via a law. Tyler does not lay out what he would do instead. I can think of ways to do it, but they involve big guns.

Wisconsin’s universities initially voted down a compromise to get rid of some DEI positions in exchange for funding for raises and new buildings, but they came around.

Washington Post Editorial Board comes out against DEI statements in hiring.

WaPo Editorial Board: The last thing academia — or the country — needs is another incentive for people to be insincere or dishonest. The very purpose of the university is to encourage a free exchange of ideas, seek the truth wherever it may lead, and to elevate intellectual curiosity and openness among both faculty and students. Whatever their original intent, the use of DEI statements has too often resulted in self-censorship and ideological policing.

Here is what they are opposing.

Paul Graham: People in the sciences thought they could ignore the fools over in the humanities and just focus on their research. But now the fools’ ideology is colonizing the sciences.

John Sailer: NEW: Yale University’s department of molecular biophysics and biochemistry requires all job applicants to submit a DEI statement.

Here’s the evaluation rubric, which shows the exhaustive DEI criteria for assessing any scientist hoping to work in the Yale department.

Here is the full post from The Free Press.

When making hires at Yale’s department of molecular biophysics and biochemistry, faculty are told to place “DEI at the center of every decision,” according to a document tucked away on its website.

To what extent does that mean an applicant’s DEI score impacts their chance of being hired? If you have a 12 versus an 8 versus a 0, what happens? One cannot be sure. It is compatible with ‘anyone under 11 need not apply’ and also with ‘no one actually cares.’

How easy is a high score? My guess is you can get to about a 7 (3/2/1/1) with a willingness to bullshit and use ChatGPT. Higher than that likely requires either lying or being willing to spend (and commit to spending) substantial amounts of time.

What about Columbia? How much do they care? What do they want?

John Sailer: NEW: For hiring new professors, Columbia University recommends valuing “contributions to DEI” on par with “research.”

The sample evaluation tool also weighs DEI more highly than teaching.

That’s an especially wild default given how Columbia defines “contributions to DEI”

Columbia provides an in-depth rubric for assessing DEI credentials. Which, of course, is pretty important if DEI might carry the same weight as research. Take a look. The rubric gives a low score to candidates who are skeptical of racially-segregated “affinity groups.”

You can feel the attitude coming off these rubrics.

This looks like a substantially tougher test to handle if you mainly care about your subject or are trying to muddle through without a huge time sink or ethical compromise. They mean business.

Given how numerical scores usually work, you do not have much margin for error. Getting a 15 here, if you are willing to do what it takes and spend the time, is easy, and probably so is getting a 9-10 in ‘service’ and that is probably highly linked. I doubt they have that high a bar to get to 8+ on teaching, and a 10 might be pretty easy there too. That does not leave much room to make up points, which has to be done with research. And a third of that is ‘curricular fit’ so those who are gaming the system are going to get full credit there too, while plans are pretty easy to fake.

Your entire actual ‘research track record’ is only worth five points. So yeah, if you are not heavy DEI for real, good luck. You’re not going to make it here.

Harvard’s Faculty of Arts and Sciences eliminated the requirement for DEI statements in hiring (source). Instead they are asked to submit a ‘service statement,’ which can include DEI if you want that. As an applicant, you now must ask: Do you think the requirement went away, or that they are testing you to see if you realize that it didn’t?

One must ask, what exactly did Sally Kornbluth believe before?

John Sailer: BREAKING: A university spokesperson has officially confirmed to me that MIT will no longer use diversity statements in faculty hiring—making it the first elite private institution to backtrack on the controversial policy.

As recently as late 2023, MIT required prospective nuclear scientists to submit “a statement regarding their views on diversity, inclusion, and belonging.” No longer. In a statement provided to me by MIT, Sally Kornbluth said these statements “impinge on freedom of expression, and they don’t work.”

Was she unable to get rid of the statements until now?

Did she think they both worked and that they didn’t impinge on freedom of expression? I can see one thinking that perhaps they work. I can’t see how one can claim they don’t impinge on freedom of expression. You either care about that, or you don’t. So, revealed preferences on priorities, then?

NYU opening a new campus in… Tulsa? Seems like an excellent source of diversity.

Childhood and Education Roundup #6: College Edition Read More »

julian-assange-to-plead-guilty-but-is-going-home-after-long-extradition-fight

Julian Assange to plead guilty but is going home after long extradition fight

Plea deal —

“Julian is free!” wife wrote after Assange struck deal with US government.

Julian Assange in an airplane seat, looking out the window.

Enlarge / Julian Assange in an airplane in a photo posted by WikiLeaks on June 25, 2024.

WikiLeaks founder Julian Assange has agreed to plead guilty to a single criminal charge, ending a long extradition battle with the United States government. Assange will reportedly avoid further jail time and be allowed to return to his home country of Australia.

Assange won’t have to travel to the continental United States. He is scheduled to plead guilty tomorrow in US District Court for the Northern Mariana Islands, a US territory in the western Pacific Ocean.

In a court filing in Saipan, the US government said:

We appreciate the Court accommodating these plea and sentencing proceedings on a single day at the joint request of the parties, in light of the defendant’s opposition to traveling to the continental United States to enter his guilty plea and the proximity of this federal US District Court to the defendant’s country of citizenship, Australia, to which we expect he will return at the conclusion of the proceedings.

During the Wednesday hearing, “we anticipate that the defendant will plead guilty to the charge in the Information of conspiring to unlawfully obtain and disseminate classified information relating to the national defense of the United States, in violation of 18 U.S.C. § 793(g), and be sentenced by the Court for that offense,” the US said.

Assange on a plane

Assange was flying to Saipan today, according to his wife, Stella Assange. “Saipan is a remote US overseas territory. He will be entering the United States. Julian won’t be safe until he lands in Australia,” she wrote.

Stella Assange wrote in an earlier post that “Julian is free!!!!” and thanked his supporters. She also announced a fundraising campaign to cover $520,000 “which he is obligated to pay back to the Australian government,” saying that he “was not permitted to fly commercial airlines or routes to Saipan and onward to Australia.”

The US unsealed a 2018 indictment against Assange in 2019, right after British police arrested him on behalf of US authorities. Assange went into hiding in the Ecuadorian Embassy in London in 2012, but the Ecuadorian government revoked his asylum after seven years.

The New York Times reported that Assange “is expected to be sentenced to about five years, the equivalent of the time he has already served in Britain.” The NYT cited a law enforcement official who is familiar with the terms of the deal.

Failed extradition attempts

In 2010, Assange’s WikiLeaks released classified documents leaked by Chelsea Manning. As Bloomberg wrote yesterday, “Assange was charged with encouraging and assisting Manning in obtaining around 750,000 classified or sensitive documents, one of the largest leaks of state secrets in US history. The original charges—17 related to espionage and one to computer misuse—carried a maximum penalty of 175 years in prison if he was found guilty on all counts in the US, although sentences for federal crimes are typically less than that.”

In 2021, a British judge rejected the US government’s request to extradite Assange, saying that he would be at greater risk of suicide in the American prison system. The US won an appeal of that ruling but legal proceedings continued. In March 2024, Assange was granted another reprieve by the High Court in London.

“Negotiations toward a plea agreement heated up in recent months after US President Joe Biden said he was considering a request from the Australian government to strike a deal that would allow Assange to return home,” Bloomberg wrote.

Stella Assange said she will seek a pardon for her husband after his guilty plea. “The fact that there is a guilty plea under the Espionage Act in relation to obtaining and disclosing national defense information is obviously a very serious concern for journalists and national security journalists in general,” she said, according to Reuters.

Australian Prime Minister Anthony Albanese wrote, “The Australian Government has consistently said that Mr. Assange’s case has dragged on for too long and that there is nothing to be gained by his continued incarceration. We want him brought home to Australia.”

Julian Assange to plead guilty but is going home after long extradition fight Read More »

monthly-roundup-#19:-june-2024

Monthly Roundup #19: June 2024

Looks like we made it. Yes, the non-AI world still exists.

New York Governor Kathy Hochul has gone rogue and betrayed New York City, also humanity, declaring a halt to congestion pricing a month before it was to go into effect. Her explanation was that she spoke to workers at three Manhattan diners who were worried people would be unable to drive to them from New Jersey. Which, as Cathy Reilly points out, is rather insulting to New Jersey, and also completely absurd. Who in the world was going to go into Manhattan for a diner?

She says this won’t interfere with Subway work. Work on the 2nd Avenue Subway line has already been halted. And that’s not all.

You’re damn right. We are going to blame Hochul. Every. Damn. Time.

So Elizabeth Kim investigated. One never talked politics at all. One is directly across from Grand Central, is not a diner, and actively wants congestion pricing. The third did in fact object. That’s it. The good news is Hochul’s attempt to prevent this seems likely to be illegal, so maybe it won’t stop us.

The good news is this was so dumb that she might get primaried, but we will have to wait until 2026.

This terrible thinking is not an isolated incident.

Governor Kathy Hochul (D-NY): The next few days it’s going to be hotter than hell across New York — so we’re making admission and parking free at all our State Parks, pools, and beaches tomorrow and Thursday!

Take your families to beat the heat, and enjoy it on us ☀️🌊

Matthew Yglesias: When faced with high demand for an excludable, rivalrous good with inelastic supply, I would make the price higher rather than lower and use the revenue for something useful.

Trump endorsed high skill immigration explicitly on the All-In podcast. He even said, only half prompted, that anyone who graduates from even a junior college should get a green card to stay in the country. It is amazing how clearly he speaks here. There is little question that Trump ‘gets it.’

Yet Trump’s track record is of course very different. Remember Trump’s H-1B Visa suspension in 2020?

So I wrote I was not optimistic about Trump following through, and indeed he has already ‘walked this back.’ Notice Fox News saying this was somehow a promise about ‘migrants.’

We should still obviously take this all up immediately in a bill and see who votes for it.

High skill immigration is overwhelmingly popular across the board, but political gamesmanship has meant we don’t have it. Shame on everyone. Fix it.

No, I don’t care that this isn’t being tied to other things you also like. FIX IT.

There is of course a potential problem with the equilibrium.

Austen Allred: I love the idea of letting more skilled labor into the United States (and making it easier to stay), I just want to make sure we realize “everyone who gets a degree gets a green card” would be mostly driven by diploma mills.

Mark Krikorian (Center for Immigration Studies, Executive Director): If someone earns a Ph.D. at a university in a hard science, I personally will drive to their house and give them a green card. The issue is any foreign college graduate, even from a bogus two-year master’s program or gender studies [major], would get a green card.

Trump explicitly included even junior colleges. Which would absolutely mean this gets dominated in terms of number of students by diploma mills, especially once that opportunity is established.

You know what? I say that’s fine, if that is what it takes. The top people matter a lot, and if you get a bunch of other young people who clear even a moderate bar, that is going to be good too. It’s not even clear raising standards would be better.

We could do something that better addresses everyone’s concerns by being narrower, and I would be happy to start there if that is what it takes. But of course Trump did not walk this back to ‘we need to limit this to real degrees from real schools in real things’ or anything like that. He went back to his anti-immigration rhetoric, full stop, as if this never happened.

Salad Size Me, eating only Sweetgreen for two weeks, goes as you would expect. The shorter duration (versus the original Super Size Me) was initially based on cost considerations, but being able to stop after two weeks was priceless.

Any time you think people know things they have no practical need to know, remember that only 1% of college students know that Hannibal was from Carthage.

Isaac King: This seems like a common failure mode in knowledge-based hobbies. People pour a ton of effort into learning the details of their field, giving it personal importance to them, and they incorrectly generalize this to a belief that their obscure trivia is of general importance.

I’m never sure whether I’m doing this. When I encounter someone who doesn’t understand some basic-seeming-to-me math or science concept, is that actually a real problem, or just me ascribing undue import to something that happens to interest me?

Women, the young and the left leaning in academia are more censorious than their counterparts, and more likely to discourage various forms of research. Cory Clark reports about 10 ‘taboo claims.’

So of course Robin Hanson offered polls on these so-called taboo topics. The ‘controversial’ positions got overwhelming support. The tenth question, whether demographic diversity (race, gender) in the workplace often leads to worse performance, got affirmed 54%-17%, and the rest were a lot less close than that. Three were roughly 90%-1%. I realize Hanson has unusual followers, but the ‘taboo questions’ academics want to discuss? People largely agree on the answers, and the academics have decided saying that answer out loud is not permitted.

Cocoa prices are dangerously high and might take years to come down. Worth it.

Disney started giving its rides long official names rather than using casual nicknames people would actually use, forcing influencers to use the real names. Which means you know they’re paid and they sound like a doofus.

You can buy vapes on which you can play Pac Man. Our watching-out-for-the-children principle is, shall we say, inconsistent.

Stadium tours doing poorly, many of them being cancelled. The upside profits are huge, and touring a ton is a very non-free action, so perhaps this is the equilibrium. If you are not failing at a large fraction of your stadium tours, you are not attempting enough stadium tours. My experience however is that you get rapidly decreasing marginal utility from going to bigger events. When I went to Radio City Music Hall to see Taylor Tomlinson’s Have It All tour, I had a solid seat and a great time, but I had to force my eyes to look at the physical Taylor rather than the giant screens of her. I’d pay substantially more to go to the smaller Beacon Theater, although I’m sure it would still add up to a lot less.

Prediction markets are unpopular. Sure, lots of people in my circles love them and want there to be more of them, but activity is limited even when you get them, and usually focused on stuff not that interesting. The basic thesis here from Nick Whitaker is that without subsidies no one wants to trade, so you need subsidy in the form of either cash, natural hedgers or suckers at the table, and usually you have none of them, nor do you appeal to investors trying to make a buck, and being slow to resolve is a huge issue.

This is all broadly compatible with my perspective from a while back. I strongly agree that you need subsidy if you want to get good action. Alas, people are mostly unwilling to pay. I think we basically need to ‘suck it up’ and be willing to pay for information, both to subsidize traders and encourage reliable wording and resolution.

As I’ve tried to use Manifold, my biggest frustration has been resolution criteria. Why do we see the same few markets over and over? It is not because those are the only interesting questions. It is because those are the questions we can quantify. If you cannot quantify, you get endless tsoris, and can’t play for real amounts. By default unclear markets turn into betting on how the judge is going to see the problem, and that is not something I care about.

I’m definitely planning on being less accommodating with nitpicks on market resolutions, especially hypothetical ones, going forward, because time is short and the stakes not so high. Yes, that means you are predicting in part how I will rule. Tough. I don’t trade on my own markets to avoid conflict of interest issues.

Modern buildings are ugly. We made that decision. We woke up, time and again, and we chose ugly. I do not understand how anyone fell for this, but a lot of people did. The cost argument does not check out. I know people actually prefer nice things in practice.

I would offer two other explanations not listed there.

  1. Vetocracy and permitting and regulatory requirements including zoning. If you have to struggle to get permission for every detail of what you try to build, and anyone can say no, are you going to risk delays or refusals in order to create something not ugly? Do you want fights over details? Or will you go with the ugly thing that you know is standard and where no one will complain too loudly?

  2. Externalities. When you create something beautiful, the whole world wins. When it is ugly, the whole world suffers. You do get the brunt of both, but a small fraction of the overall effect. It is only somewhat priced in. It makes sense that you would not invest sufficiently in it. This used to be made up for by people caring about that sort of thing inherently and it granting more status.

For public buildings externalities are sort of priced in, but not fully, and you have even more of a vetocracy and designed by committee issue, on top of the ‘yes someone pulled a con on us and convinced Very Serious People ugly was good somehow’ the article discusses. For private ones, you have both issues.

In potentially a big practical deal, the courts have now ruled that CEQA (California Environmental Quality Act, their version of NEPA) should no longer be construed to give the ‘fullest possible protection,’ a formula that means no one ever does almost anything, and instead treat it as one would an ordinary law. Maybe we can build some things now.

Government actually working: If only the system worked like this more often, in response to a call to extend our insane child car seat requirements to airplanes:

Kelsey Piper: Fun fact, the FAA reviews this periodically and always concludes that, by raising the cost of flying and making more people drive, it would likely increase child deaths.

This is my literal favorite fact about any regulatory body and I cannot shut up about it because so many regulations are written with willful obliviousness to the harms done by making things more expensive and annoying.

Imagine if we went back and analyzed all our existing rules around airplanes, and everything else, around similar principles.

Biden tariffs on China seem quite bad, thanks to Governor Polis for being willing to say it plainly. Taxes on input goods like the 25% on steel and aluminum are madness.

Activists successfully lobby Belgian government to give prostitutes ‘proper labor contracts’ that give them all the protections, rights and procedures you get in the European labor market. Then people realize what those rules imply, and ‘when you refuse to do assigned tasks ten times in six months we call in a government mediator’ suddenly sounds like what it is when those tasks are often sex acts. If you are going to mandate contracts and salaries and benefits and refusal rights and make it hard to fire workers, that has consequences, and not all of them are higher prices.

Another brief analysis on the government anti-trust case against Apple.

Ben Krauss at Slow Boring proposes higher education for police officers, both a dedicated university and programs at universities, complaining that our police officers get fewer hours of training. Oh no, the default reaction should go, more occupational licensing and credentialism and wasteful gatekeeping and signaling, even if as he suggests we don’t increase requirements outright. I very much did not buy the case that this solves any of our real problems.

California rules on wages and fees continue to take their toll on restaurants. The costs add up. I do not however have sympathy for those complaining they will have to bake the fees into menu prices. That seems great. Yes, there will be initial sticker shock, but this puts everyone on a level playing field. In general, the game of ‘everyone is forced to hide the true price’ is a great place for intervention. Ben Phelps has similar thoughts in this Twitter thread.

Why did it take 10 years to open a Trader Joe’s in Hayes Valley? For years they wouldn’t let anyone open a ‘chain grocery store’ anywhere pink on this map:

So they passed particular laws to ‘allow’ a grocery store in an area with no grocery stores. The first time, they couldn’t open until a condo was completed (because shrug) and that took so long the store backed out. Then in 2019 they tried for a Trader Joe’s, but the developer was caught bribing officials to let the development go faster, so it had to wait until they were bought out.

The obvious question is: why did anyone think banning ‘chain’ grocery stores was a sane idea in the first place?

I considered putting this one in Crime and Punishment.

Shirt, raising questions it answers.

The European Union has declared itself opposed to encrypted chats, and is trying to pass laws to that effect. Signal has promised they would leave Europe rather than comply. Matthew Green says they are extremely close to proposing such a law. It might have already passed by the time you read this.

Symbolic importance: UK hotels engage in weekly fire alarm tests that everyone treats as not real and they look at you funny if you don’t realize. Never sound an alarm with the intention of people not responding, even or especially as a test.

A big advantage and also big danger of becoming rich and powerful is people get afraid to tell you no. In some contexts, that is great, you get what you want and you can motivate people to do things. When flying in bad weather, not so much.

Kelsey Piper: There are several famous plane crashes that killed presidents where foul play was strongly expected and the ultimate explanation was crew inexperience and a terror of telling the President that what he wanted them to do was ill advised. This is one, this is another.

There are also some billionaire plane crashes with a similar dynamic. Pilots who should have said “no, I am not qualified to safely do that”, who would have said that to an ordinary client.

Money and power can buy a lot of things but they seem actively counterproductive sometimes for purchasing “someone who will tell you that the thing you want is actually a bad idea and they won’t do it”.

This is part of why such people sometimes find it highly refreshing and useful when they find someone willing to tell them no. The problem in the case of planes is that planes are far too safe. So you want the pilot to be bolder than normal. But not too bold.

Macron calls snap elections in France, despite clear danger Le Pen and the far right could win, on the theory that the threat of Le Pen and the far right winning means he will win. It probably works; the problem is it sometimes doesn’t. This is a central problem with democracy. Everyone wants to run against the existentially disastrous (in the eyes of their voters) candidate, so they can win, right up until eventually the disaster happens. Generalize this, including to AI.

Biden Administration to ban medical debt from credit reports. If it cannot go on your credit report, why would anyone pay a medical bill that was not big enough to justify going to court, or at least one they did not feel was fair, especially as social norms around this shift? If that’s true, asks Robin Hanson, who would issue this ‘medical debt,’ and offer services without payment or insurance in advance? Mostly I think all of that is fine. Instead of fake super inflated bills no one consented to, we’d get up front known pricing, and people could take on other debt to pay for it as needed. It’s still illegal to not provide sufficiently urgent care either way.

The alternative is to continue with billing like this, where an ER visit costs $2215 for ‘the visit,’ $1200 for a nurse’s swab of a 3 year old’s throat for a covid/strep test, $740 for two minutes with the doc, then the ‘cash pay’ is $685. End this scam.

Flo Crivello reports from time at Uber eight years ago (so things may have changed) that for finding shortest routes, Apple Maps was best, followed by Google Maps, and Waze was far behind both. Waze perhaps makes people feel smarter and in the know, but it is too clever by half and did not (at least then) actually translate into faster routes.

Why did Google never implement a ‘nicest route’ button? Because people might use it to select nicer routes, and thus give more foot traffic to richer areas. So Google decided to hide this information from its customers to avoid that.

If it had ended here it would have been purely for the popcorn: A conversation between Yann LeCun and Elon Musk, part one.

Then… well…

People will actually tell Elon Musk he has never done Science and will die bitter and forgotten because he did not publish, or did not publish in the proper scientific journals.

After a highly justified roasting all around, Yann quickly retreated back to the Motte, which is far more reasonable.

Yann LeCun: So much misunderstanding of this comment!

Here is a list of things I am *NOT* saying:

– you need a PhD to do Science. You don’t. A PhD teaches you to do research, but you can learn that on your own (though it’s much easier with a mentor).

– you need to get papers accepted by a journal or conference to publish: you don’t. You can just post it in http://ArXiv.org. Many influential papers never went through the formal peer review process, or went through it after they became influential.

– engineering is not science: it can be, depending on your methodology. I’m a scientist *and* an engineer. These activities are complementary and need each other.

– science requires formal papers: it doesn’t. A clear explanation on a website and a piece of code on a public repo will do.

What I *AM* saying is that science progresses through the collision of ideas, verification, analysis, reproduction, and improvements.

If you don’t publish your research *in some way*, your research will likely have no impact.

These are very different statements. No, the first statement did not say ‘all you have to do is put it up on ArXiv.org.’ I love this illustration of the classic two step, the flip between science and Science™. The difference between ‘you have to tell people about your discovery or they won’t know about it’ and ‘if your statement hasn’t gone through proper peer review in the official channels then I can act as if it isn’t real.’

I would be thrilled if we could all agree on that second thing. Science is where you figure things out about the world. When the guy in the shirt says he will do science to his cookie, he speaks truth.

If you then want to add to the light of science, then you also have to tell other people your findings.

That’s it. No gatekeeping.

Or as Xkcd famously put it:

Say what you want about Elon Musk, but admit the man ships and he experiments.

Similarly, here’s that quote from Katalin Kariko’s book, Breaking Through. She still got mRNA vaccines to happen despite being driven out of her position for trying, and this thread from St. Rev Dr. Sev explains that weirdoes like her who think science should be about doing actual science are not to be tolerated going forward by those who only know Science™.

Goro Shimura: The thing that bugs me about a lot of the replies to this is the number of people (mostly American) looking at what is clearly meant to be a description of rank obsequiousness mixed with self-promotion and saying “but of course these are just basic social skills”

St. Rev Dr. Rev: A whole bunch of Leading Scientists with Professional Headshots on Twitter Dot Com are extremely buttmad about this quote. Genius is a dime a dozen, they are saying. Science is about project management and filling out forms!

Well, Science is about that now, anyway.

I reflexively blocked the ratfucker who said the thing about genius so I can’t find it now, but check out this other ratfucker. If genius can make a difference in your field, it’s immature!

Kariko revolutionized her field in the teeth of people like this, and they will never forgive her, and they will fucking destroy the next Kariko they get their hands on.

An unspoken conspiracy of mediocrity. The purpose of Science is to turn grant money into papers, nothing more. Actual progress threatens to disrupt a lab’s business model. Can’t have that.

The greater part of modern science (by staffing levels, at least) is worthless bunk.

But when everyone’s a fucking high-agreeability pod person, you don’t filter the trash once it’s clear that it’s trash. That would be unmutual, it would interfere with the flow of grant money. So the intellectual trash piles up. That’s good leadership and community service.

I grew up reading about how Science was done in the mid-20th century. My mom worked in a cancer research lab herself. Disagreeable weirdos have always been critical to scientific work. Purging them because they make the conformists uncomfortable is a fairly new development.

St. Rev. Dr. Rev.: So this thread Took Off, as they say, and a lot of people dug it but some people got really nasty, like, ‘oh you think you’re BETTER than other people, like you don’t need to FIT IN, like you should get money for free’

I think Katalin Kariko is better than other people.

More fun ‘things are not science’ here.

If you think Science™ makes good funding decisions on the merits, well:

Julian Togelius: This Dutch study finds that panelists make the same allocations of research funding even if they don’t get to read the actual proposals, just abstracts and CVs. This result *should* have a large impact on science funding policy. (h/t Thore Husfeldt)

Abstract: Do grant proposal texts matter for funding decisions? A field experiment

Scientists and funding agencies invest considerable resources in writing and evaluating grant proposals. But do grant proposal texts noticeably change panel decisions in single blind review? We report on a field experiment conducted by The Dutch Research Council (NWO) in collaboration with the authors in an early-career competition for awards of 800,000 euros of research funding.

A random half of panelists were shown a CV and only a one-paragraph summary of the proposed research, while the other half were shown a CV and a full proposal. We find that withholding proposal texts from panelists did not detectably impact their proposal rankings. This result suggests that the resources devoted to writing and evaluating grant proposals may not have their intended effect of facilitating the selection of the most promising science.

Julian Togelius: Far too much time and effort goes into writing and reviewing grants. The grant funding system also distorts priorities, rewarding faculty for spending their time writing grants instead of doing research. It’s the worst part of academia.

I think we should simply do what it implicitly suggests: replace grant proposals with submitting abstracts (maybe half a page or so) and CVs. Plus some regularization to ensure a more even spread of grant money. Better for everyone.

“But what about the new investigator that has no track record but a brilliant idea?”

  1. Specific grant schemes for new PIs, as already exists

  2. Research is a social endeavor, you learn it and get a track record by collaborating with others

  3. Brilliant ideas are a dime a dozen

In other words, Science™ does not care about the details of your research, and this is good, actually; we should stop wasting time with that and allocate money based on your social status.

Thus Ruxandra Teslo proposes this law, after explaining that failed corporatists are forcing the weird nerds out of academia: Any system that is not explicitly pro-Weird Nerd will turn anti-Weird Nerd pretty quickly. Most would-be Karikos, including the ones who are not somewhat crazy, are driven out.

Another sign of how things are going, yes the study data is posted online.

Ben Landau-Taylor: In 2023 Ian Hussey tried requesting data from dozens of academics who promised “data available upon request”, and found they were LESS likely to share data (17%) than authors who did not make any promises (26%).

Over and over again, when we check the parts of today’s academic process which can be inspected, it turns out that there’s nobody home. The parts which are harder to inspect? Well, I’m sure those are fine.

The rationalist term is ‘front-running the steel man’; for the German, Claude suggests Replikationsmangeleinsichtsanerkennung (‘acknowledgement of the insight despite lack of replication’):

Tess: There should be a German word that means “I see where you’re going with this, and while I agree with the point you will eventually get to, the scientific study you are about to cite doesn’t replicate.”

Paper from the Federal Reserve Bank of Dallas estimates 150%-300% returns to government nondefense R&D over the postwar period, as measured by its contribution to business sector productivity growth. They say this implies underfunding of nondefense R&D, but that does not follow. One should assume decreasing marginal returns, so a high average return is entirely compatible with the level of spending being too high. I also would not assume conditions are unchanged and spending remains similarly effective.
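
To see why a high average return does not settle the question, here is a minimal sketch using a stylized returns function of my own choosing (not the paper’s model): with diminishing returns, the marginal dollar can fail to pay for itself even while the average dollar looks spectacular.

```python
# Toy illustration, my own stylized assumption rather than the Dallas Fed model:
# with f(x) = a * x**b and 0 < b < 1, the marginal return is b times the
# average return, so a 300% average return does not imply the next dollar
# of spending is worth spending.

def returns(a: float, b: float, x: float) -> tuple[float, float]:
    average = a * x**b / x          # output generated per dollar spent
    marginal = b * a * x**(b - 1)   # derivative of a * x**b at x
    return average, marginal

# Example: pick a so the average return is 4x (300% net) at spending x = 100,
# with fairly strong curvature (b = 0.2).
avg, mrg = returns(a=4 * 100**0.8, b=0.2, x=100)
print(f"average return: {avg:.2f}x, marginal return: {mrg:.2f}x")
# -> average return: 4.00x, marginal return: 0.80x, below break-even
```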

What are the load-bearing posts of our time? Only one way to find out. Recommended thread if you haven’t yet. I am sad you can’t easily find all the quote tweets.

TikTok gives different people completely different comment feeds on the same video. Woman gets comments supporting female video creator, man gets comments supporting the creator’s boyfriend instead. Evil genius.

fabian: the final stage of web2 social media is that everyone is heaven banned

maybe not enough demand yet to enable more controls, but maybe just too crude tooling?

let folks tap more seamlessly into different simclusters, view feed as-redneck/feminist/techbro/nigerian-communist

TaraBull: TıkTok is dividing people by curating entirely different comments to us.

Do you look at comments to gain perspective on social media?

Was this purely the ‘people you follow or engage with show up first’ principle being strong enough if you spend too much time on the platform? I very much doubt it.

Ragged man stands up, says anything beyond that should be against the rules. Everyone gets different feeds, but aside from actively connected specific accounts we should mandate that everyone gets the same comments sections, unless you are being intentionally heaven banned.

You can still gain perspective from the comments on videos even so, but you need to be properly calibrated, and understand you are seeing a customized reaction. How that compares to your expectations is less information, but still information.

You want more evidence TikTok is an enemy of America?

It hates us so much it banned anyone who helped promote Ozempic, without warning, under the ‘Integrity and Authenticity’ policy, in particular the ‘might be helping Americans be better off’ clause.

“We want TikTok to be a place that encourages self-esteem and does not promote negative social comparisons,” TikTok says in a preface to the rules.

That’s right, yes, not letting people say a healthy weight is good is an actual CCP op.

And yet, the algorithm knows all:

Stephanie Boyle: I’ve seen all of these creators on my fyp. I usually see them complaining about being banned which I often find mildly amusing. If they were banned or shadow banned, I wouldn’t see them I would think!

The market only has a 33% chance that TikTok will actually get banned, despite ByteDance having revealed it won’t be allowed to divest (I bet nominal yes purely for tracking purposes and don’t have a strong opinion).

Liz Miele got flagged on YouTube for hate speech on her latest special Murder Sheets because she playfully calls her own cats the C-word, despite their policies not even listing the word, with no way to fix it, cratering her ad revenue. I was at the taping of this special, and calling that hate speech is completely absurd. This feels like an AI Solves This problem, and also a Human With a Brain Solves This problem? Yes, perhaps for people with 8 subscribers and 31 views you cannot afford to check when someone appeals, but this is very much not that. The good news is that enough people heard about this that one of them found someone who could hear her appeal, and they fixed the problem. Yay.

Did you know that if prominent people give you retweets, you get more views and likes? Yeah, least surprising economics experimental finding ever, and that’s saying something. What is more interesting is that prominent economists retweeting job market papers actively did boost flyouts and job offers, with women receiving 0.9 more job offers. Which is a lot of job offers given you only ultimately need one and the average for controls is 1.5.

Paul Goldsmith-Pinkham: The average control individual in this sample is an individual who has 11 tenure track and 16 total interviews, 5 and 3 flyouts, and 3 and 1.5 offers. Notably, being URG doesn’t predict (significantly ) on any of these outcomes for the control.

Why does it work? Here is one guess.

Paul Novosad: An explanation could be that the candidate search is EXTREMELY random.

We get 1000 applications at Dartmouth, and our administration requires that the same 3-4 people review every single one.

It’s an overwhelming task. It’s inevitable that people make quick decisions — as happens in college admissions and all other kinds of job hunts too.

Any kind of positive signal at that first stage could increase your odds of moving forward substantially.

Never mind social media, is the internet bad for you? The study says mostly the internet is good for people, actually (ragged man stands up at meeting), although in some data sets women ages 15-24 in particular are worse off. I am not in the Patrick McKenzie camp where the internet is by a mile our greatest invention, but yes, the internet is pretty awesome and I am very happy to have it. Also I agree with the commenters that any such study is hopelessly confounded.

New York passes law making it illegal for social media websites to provide ‘an addictive feed’ without ID verification. It is called, of course, ‘SAFE for kids act.’ Also parents will be able to pause notifications for their kids from 12am to 6am (okay, I guess), and ban selling data from users under 18. Doesn’t seem great, plausibly unconstitutional, and it is always weird when people say ‘you cannot collect our data’ and then require age verification.

Nate Silver: The [Twitter] For You algorithm is pretty good at picking up on your revealed preferences so if you’re complaining about it, you’re kinda telling on yourself.

It measures your interactions, so you are telling on how you choose to interact. We are forced to be disciplined in how we react, lest the AI give us what we on reflection do not want. We now have to exercise this kind of strategic thinking with every online interaction. It is exhausting.

Twitter porn bots. Hard to catch?

Michael Nielsen: Can someone at Twitter anonymously explain to a reporter why the pornbots are being allowed to proliferate? (I presume it’s because Elon thinks it’s funny?)

Paul Graham: Apparently they’re hard to catch. I know this seems implausible.

I roll to disbelieve. I could believe that porn bots that are trying maximally hard to not be caught are hard to catch. I flat out refuse to believe that the actual in-practice bots on Twitter are hard to catch. The bots are so unimaginative that I’ve gotten the exact same message about a sister looking to date 10+ times, and the same exact crypto messages. Well over 90% of the porn bots share several very clear characteristics.

I have an alternative theory. Now hear me out. Twitter is choosing to have bots that are trivial to identify. If they crack down, then the bots get sneakier, and actual humans have to spend time on them rather than recognizing in 200 milliseconds that it is a bot. Better, they have decided, to do a phony war that doesn’t actually cause much stress or lost time. It’s crazy, but not as crazy as it sounds.

Could it be as dumb as this?

Tyler Young: Some of them are sophisticated. Some are very much not. My bet is that Twitter has no interest in solving the problem because the bots boost their engagement metrics.

I cannot rule it out. I mean, you’d think it can’t be this stupid, but it can. At some point, making the insurance fund an actual random number is less harmful than making people miserable in order to create a more credible fake number.

Patrick McKenzie sees them as a visible test of non-state capacity, similar to cleanliness at McDonald’s.

Twitter made likes private. Note that even if there are no flaws, it is two-way private. The person whose Tweet you liked knows it was you, which is vital to Twitter functioning.

Paul Graham: Instant 10% increase in likes. Large numbers of people must have different opinions than they dare express, to move the total number of likes by that much.

The problem is that people have literally gotten into big trouble or been attacked out of context for merely liking the wrong Twitter post. Whereas the upside of liking a post is very small, and also people might look at your list of likes to find good content.

Stuart Buck: One downside of Twitter making “Likes” private is that one of the most interesting ways to find new ideas/tweets was to go to the “likes” of someone you admire, and see what they had been reading lately.

I occasionally enjoyed seeing the “likes” of John Arnold, Patrick Collison, and others. Lots of overlap with the stuff that I read, but it would regularly turn up interesting ideas/people that I hadn’t seen.

So it makes sense to now be modestly less selective, also it could easily be a temporary bump from the new policy (‘I can like everything I want now, mwahahaha’).

Michael Keenan: Like everyone else, I’d rather they make this optional per Like. A side benefit would be that we could see a tweet’s public:private Like ratio, which would measure taboo strength. We’d see what taboo topics are ready for an information cascade.

Complexity is bad and choices are bad, and a ‘private like’ carries a weird implication. Not being public with your likes could be seen as a kind of ‘evidence of guilt,’ even, or you could be blamed for being public or private. I am not excited to split the baby here, but it does solve some issues.

Violet Blue: So now scammers and bots can artificially inflate post popularity and no one can verify if likes are from any real accounts. A gift to influence ops.

Shoshana Weissmann: This is a REALLY good point. This is another huge use of checking likes.

There was once a company opposing R Street’s work. All the likes were bots and weirdly the like count fluctuated throughout the day. Now we won’t know.

Yep. Public record of likes lets you understand context. What type of engagement is happening here? Who is liking this and who is not? It is rarely the best use of one’s time, but occasionally it was valuable, as would have been tasking an AI with this.

Beff Jezos notes that likes often said ‘I understood this post’ or flagged things for their followers, regrets that this is gone, and argues the new world will only reward those who cater to the center of mass rather than the tail of intellect (virtue of silence goes here). The first use should mostly still be intact, since the author still knows. I do think Jezos has a point here, but that this does not shift the balance of power all that much. Already Twitter favored the middle quite a lot.

That could be part of the motivation as well. If your likes are public, an AI can use that as data in a way humans could not do at scale.

Scott Alexander on the Far Out Initiative, a quest to abolish suffering by altering neurotypes rather than the usual proposed method of omnicide. The claim is that Jo Cameron is incapable of any form of suffering, and she’s otherwise mostly fine, only a few minor accidents, she still seems to do things and care about things, it’s great. So let’s do that for everyone and ow who put that stupid fence there?

I always view a focus on suffering in general, especially when suffering is treated as The Bad, as running a great risk of asking the wrong question. Suffering is highly correlated with things sucking, and provides valuable information that things likely indeed suck and in exactly which locations and ways they suck. This is highly useful, both as information and motivation.

That does not mean we currently have the correct level of suffering in response to things sucking, or that a lot of our suffering is not mismatched. Nor does it mean that the suffering does not make things suck a lot more than they need to.

That is a roundabout way of saying the right amount of suffering is probably a lot lower than the human norm under current conditions, let alone those who report constant suffering, but the right amount is not zero. I do not sufficiently buy the ‘you can vary how happy you are instead’ counterargument. Negative reinforcement should not purely be the lack of positive reinforcement. A knob to lower this setting would be immensely valuable, but yeah, I worry a ton about what people would do with it.

Here is a question that is not so correlated with that, along with the entire history of the question:

Stefan Schubert: Most people are not unhappy. [then he shows this graph]

Danielle Fong: It’s fascinating how un-impacted this data series is by basically anything.

Matthew Yglesias: It’s fascinating how un-impacted this data series is by basically anything.

How do I know? Because ‘lol nothing matters,’ to this extent, is not a plausible hypothesis.

Are you telling me 2008 did actual nothing? That 2020 did actual nothing? Phones?

Yeah, no.

My explanation is that this question is being answered in relative terms. You aren’t equally happy during a pandemic or financial crisis, but that is not the question being asked. How your personal life is going is a question that mostly rules that stuff out and is judged compared to other people around you, and we are mostly risk averse and willing to accept somewhat below average performance, so we consistently bat around 80%.

Here’s what Stefan was responding to:

Tim Denning: Most people are unhappy.

So, I’ve spent 20 hours watching Bill Murray interviews over 3 months.

What did he find? Condensed here for space:

  1. Forget trying to be famous, try to be rich first.

  2. The more relaxed you are the better you are.

  3. Be weird as hell, crash random events and parties.

  4. Tell everyone you are retired.

  5. Most mental health advice is too serious.

  6. It’s hard to be an artist. It’s hard to be anything. It’s hard to be.

  7. The automatic things you do are basically those things that keep you from doing the better things you need to do.

  8. Whatever you do, always give 100%. Unless you’re donating blood. Giving a s*** is underrated, the competition is weak, most people never try.

  9. Melancholy is kind of sweet sometimes.

  10. It’s not that attractive to have a plan. Focus on being resourceful, not clever.

  11. It just doesn’t matter! People worry about dumb stuff. Go do epic stuff.

  12. You can tell how boring a person is by the lack of fear in their eyes when someone is flipping through photos on their phone.

  13. Just beat my record for most consecutive days without dying.

  14. People say I’m difficult. Sometimes that’s a badge of honor.

Strongly agree: #1, #2, #5, #6, #9, #13.

Directionally agree, good advice, but not strictly consistently true: #3, #7, #8, #11, #14.

Not buying it: #4. Never retire. Maybe tell people different, sometimes?

Actively disagree: #10, #12. You need a better plan, and it is boring to take photos rather than live, although I am considering changing that somewhat because of AI search and using the photos as a kind of memory bank.

Given the non-obviousness level here, that’s a very good hit rate.

Jerry Seinfeld’s commencement address at Duke was very good. So was his appearance on Honestly. It is fascinating how much more interesting I find Seinfeld when he is not on stage, compared to when I saw him perform at the Beacon Theater.

Ruxandra’s post claiming that autists (rationalists being the chosen example) and the Internet will defeat the monoculture. I do not see us bringing down the monoculture (at least not via non-AI methods). The monoculture need not much care that there are a handful of people off in the distance doing their own thing, and indeed it will come for such groups in time, and it has. If all the major tools and attention are monoculture, and there are a bunch of small splinter factions that occasionally get to add some concepts to the monoculture, that is better than not having the factions, but it is still mostly monoculture.

Polymarket raises $70 million including from Vitalik Buterin and Founders Fund. As is noted in the announcement, Polymarket is often now cited as a news source, since it is the largest prediction market on major events even without American participation.

Note that they are crazy low on Biden, having him at 34% (!) as of this writing, with Trump at 56%. Whereas Manifold has Trump 52% versus Biden 46%. Adding to 98% is slightly too high, but adding to 90% is clearly too low. In general Polymarket is biased towards Republicans. The obvious play is to take the gift directly, as they (at the time) had Biden dropping out at 24% (!?!) versus Manifold’s single digit levels. Yes there is some chance you lose, and nothing is ever investment (or gambling) advice, but hot damn. Remember always that such mispricings can persist, so you are probably stuck holding until election day. Or, perhaps, somewhat after it.
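
As a minimal sketch of the arithmetic (the 97% figure below is my own assumption about the chance that one of the two wins, not anything from the markets): if two near-exhaustive outcomes trade for a combined 90 cents, buying one share of each is a bet that pays a dollar whenever either wins.

```python
# Illustrative only; prices are from the post, the probability is my assumption.
trump_yes = 0.56               # price of a $1 Trump YES share
biden_yes = 0.34               # price of a $1 Biden YES share
p_either_wins = 0.97           # your own estimate that one of the two wins

cost = trump_yes + biden_yes   # 0.90 to buy both shares
expected_value = p_either_wins * 1.00
edge = expected_value - cost   # expected profit per pair of shares

print(f"cost={cost:.2f}, EV={expected_value:.2f}, edge={edge:.2f}")
# -> edge = 0.07, about seven cents per dollar, with the capital locked up
#    until the markets resolve (and fees plus platform risk still apply).
```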

Review of a new book on basics of Bayes, looks promising.

A look from Dylan Matthews inside the INR, the State Department’s Bureau of Intelligence and Research, which uses a small group of dedicated domain experts (as opposed to places like the CIA where everyone rotates every few years) and got Vietnam, Iraq’s lack of a nuclear program and the early stages of the Ukraine war right. Which would have been a lot more useful if anyone had listened. Of course, they are far from perfect.

Dylan Matthews: For their part, INR veterans tend to be less triumphalist, preferring to say they were merely “less wrong” than other agencies. They agreed with other agencies that Iraq still had biological and chemical weapons, and they got that wrong.

The article is full of INR wins, and notes some INR losses. It is ‘contrarian’ because it does not bow to government consensus and is proud of dissent. Alas, they are being shrunk, and they are paid poorly. It is going to be tough. And their methods depend on far too much confidential information for us civilians to tap their expertise.

News you can use: A map of the public bathrooms in New York City.

The bees are fully back.

Topher Stoll: This is the hilarious tragedy that plagues all of human endeavor. If we rally to fix a problem in time, idiots will come out of the woodwork to say that there was never a problem to begin with. See also: Y2K, the Ozone Layer, global food supplies, “peak” oil, Acid Rain.

One day, god willing, some incurious doofus will be able to say with a straight face-

“Pssh, climate change was NEVER a danger! All our energy is renewable, the geo-engineering is going great, and we’ve restored 90% of habitats around the world.”

That’s the dream.

Yep. Always the dream. Ideally we’d be measured before and appreciative after. Alas, it almost never works that way.

Tyler Cowen recommends reading about a specific business you know a lot about already, or if that fails about the business of a sports team or musical group that resonates with you, as opposed to books in the ‘business section.’ As he says, the part about not reading ‘business’ books is the easy insight. The hard part is what is good. Here I worry that there are too many important differences between superstar competitions and other practices, and thus if you are not careful you would learn many wrong lessons. But I do agree that looking into areas you know is great here.

Tyler Cowen book recommendations: Olivier Roy’s new book The Crisis of Culture: Identity Politics and the Empire of Norms was a very strong one. He also suggests In This Economy: How Money & Markets Really Work by Kyla Scanlon.

Also he says in Cape Town you reverse the usual rules and eat at restaurants with beautiful women facing the waterfront, because everything is cheap and you want to be part of the class of people with money. Order the seafood.

He does not mention this, but the right next question is, how does this generalize?

A Twitter thread guide to hosting gatherings. This model says: Look for people who are interesting and are interested in others, never invite people because you feel obligated. Curation of people is key. You only need 14 square feet of active party space per person. Create talking spaces where people face each other, ideally limited to 4-5 people. Warm bulbs for light, make a playlist, mostly don’t sweat it.

Some related good advice on community spaces:

Tetraspace West: I think my hard won community management advice is:

  1. Laissez-faire and free speech is for strangers; your walled garden is tiny and low-stakes, be ruthless.

  2. Not *technically* breaking the rules is breaking the rules.

Your discord server can maybe have an #appeals channel, if you know what you’re doing; if you start creating something that looks like a legal system, you’re copying intuitions from systems much larger and more alienated and less designed than yours.

A justice system is based on the principle that punishing innocent people is very bad, and decisions must be objective. In many situations, those should not be priorities.

Also good social advice:

Elizabeth van Nostrand: I’ve known of several people who violate social rules a lot and tell people there have been no consequences. They are wrong about this.

It might be true that it’s a good trade off for them, but I also know of opportunities they otherwise would have been offered but weren’t because they were considered too hard to work with.

Long ago I read a blog post from a clerk at a porn rental store (so, really long ago) about a karma system he + coworkers implemented. They had a fair amount of leeway around late fees, and if you were rude to them or another customer it would never be used in your favor again.

Like a note went in your Permanent Record at the porn rental store that you were mean and they should be mean back.

The justice feels delicious here but no one was being made a better person by that so mixed feelings. See page 52 of this PDF.

Examples of rules broken: arrive within an hour of when you said you would most of the time, don’t yell at people or call them names, don’t constantly ask for favors from near-strangers and if you do at least be really nice about it.

Oh and my favorite “starting projects other people depend on you can’t complete, forcing others to rescue you.”

Also sometimes these people lie. I’ve heard people forced out of multiple spaces that were deeply important to them, tell others they’d never faced consequences for being too X.

Paul Crowley: This is a great caution. You often won’t know about the invites or kinder treatment you didn’t get because someone noticed you violated a rule. They often won’t tell you. Also, rule-violators lie about this stuff.

I have known more than one example where a whole circle of people have known that someone is a liar, but no-one tells them to their face, and they very likely think they’re getting away with it.

Quinn Que: An easy example of this is being blocked by people you’ve never interacted with on social.

Paul Crowley: I block like this a lot!

Five models of how to live near friends, from Cabin:

I strongly endorse the Apartment Cluster. I have some small experience with this, having had one friend living in our building. It was awesome. It is hard to overstate the extent to which not having to go outside meant we had more interactions. Same floor would have been another big jump. Trivial inconveniences matter.

The best part is that this is easy to do. In any big building there will be openings over time, and presumably you chose the place for a reason. Alas, our problem is that those we know always wanted to live in cheaper locations than we did, so we couldn’t make it work.

Yes, you could do this in reverse via ‘meet your neighbors,’ but these days it is difficult, and it turns out most people are not on the same wavelength. The people in the next apartment are lovely, but we have so little in common. It is hard to make that work these days.

Minihood is the classic version, potentially even easier, and the one that was pulled off in Berkeley. Again, exact proximity matters a lot. You want easy walkability.

The duplex dream is a step up from both, if you can pull it off. ADU is a stretch.

Micro-village is often the dream. I have seen much talk of it over the years, but no one ever seems to get that close to pulling it off. Coordination is hard. From what I can tell, this will only happen if a small subset is willing to take the risk and do the work, and then offer to let others follow. You will also need easy access to civilization.

I am late to the party on this one due to other concerns, but still seems worth weighing in. By all means skip if you consider this old and busted.

The FTC has decided they are the fairness department. They decide what is fair. If they decide your agreement is not fair, that agreement is null and void. If you don’t like it, tough, because life is not… well, you know.

In this case, the thing that they have decided is not fair are noncompetes.

Dave Michaels (WSJ, April 23): The Federal Trade Commission on Tuesday banned employers from using noncompete contracts to prevent most workers from joining rival firms, achieving a policy goal that is popular with labor but faces an imminent court challenge from business groups.

The rule prohibits companies from enforcing existing noncompete agreements on anyone other than senior executives. It also bans employers from imposing new noncompete contracts on senior executives in the future.

Noncompete clauses violate a 110-year-old law that prohibits unfair methods of competition, the FTC says.

Outlawing noncompetes is hugely popular with many workers, and the FTC estimates that its rule would boost their earnings by $400 billion or more over 10 years. Cosmetologists, who earn about $35,000 a year according to federal data, say noncompete agreements are a drag on their earnings.

The move, approved 3-2 by the Democrats on party lines, has roughly a 50% chance of being upheld after all appeals.

Pacific Legal is suing on the highly reasonable grounds that this is none of the FTC’s business and these agreements can be good for both parties by enabling training.

Austin Campbell is one of many celebrating this decision, calling it an earth-shakingly massive win for free markets and capitalism to deny people the freedom to contract away their future freedom of contract. In practice, he argues, noncompetes are highly abusive and workers are put in horrible situations.

Like many, he argues that this isn’t a free contract because many don’t know what they are agreeing to. It doesn’t have to be that way. A noncompete is a fairly straightforward thing. I once signed one that the employer refused to waive or even to let me buy out of or negotiate about, and that I decided to honor, and it sucked, but I did not have a lawyer and I was not for a second confused on what I was signing.

Did I check if it was enforceable in my state at the time? No, because a contract is a contract is a contract, I knew what I agreed to, and I was not about to break my word even if I wasn’t legally bound to it.

The flip side is studies show workers don’t understand and do not bargain with the noncompete in mind. Which seems crazy to me, but also shouldn’t obviously matter if employers are competing for workers? Then there are workers who aren’t aware they even signed. That I agree should not be allowed, you should have to be very clear that this is a noncompete and on what it applies to.

Here is Luke Herrine sharing a bunch of examples of workers who got screwed by noncompetes.

Others complain of an equilibrium where most employers insist on noncompetes, putting workers in a terrible position. The next question is, why doesn’t one employer compete by offering lower wages and not requiring a noncompete, if that is better for workers?

  1. One possibility is that we are up against the minimum wage. If that is the case, then yes, employers will have to compensate with other terms, and banning these agreements is a lot like raising the minimum wage further, and likely the superior choice. It certainly seems like there should be some wage floor on new noncompetes to avoid this, substantially above the minimum wage.

  2. Another possibility is that the employees, whether or not they know what the agreement says, are wrong about what the agreement is worth to them. Like in many other places, they focus on the headline number and other short term factors, and don’t properly factor in the noncompete. Alternatively, they are liquidity constrained so they have to make tradeoffs.

  3. A third possibility is that you don’t want the employees who are more inclined to refuse to sign a noncompete, because they are the ones who will leave and compete with you, so the equilibrium is everyone has to sign even though that’s not net good. That would be a story where intervention makes sense.

  4. Another story like that is if competition and dynamism are largely public goods. So the employee and the employer can make a deal that leaves both better off, but it makes everyone else worse off, so you want to tax or ban it. Possible.

Betsey Stevenson is on the ‘victory for the economy’ side.

Tyler Cowen refers back to his post from January 2023, where he argues noncompetes can be welfare enhancing. His argument is straightforward. If you can go to a competitor tomorrow, I am forced to silo my trade secrets and other information, and I will invest less in your skills. At the low end, noncompetes seem net negative, but we shouldn’t be too restrictive.

Alex Tabarrok agrees with Tyler on the merits that the proposed ban is too broad, and also questions the FTC’s authority. As he points out, the FTC’s claim that banning noncompetes will raise wages ignores that this is part of the worker compensation basket. By default, we should expect wages to go down short term. My response would be the FTC is abrogating existing contracts, which effectively raises the wages and bargaining power of the impacted workers, which means the short term impact could indeed send wages higher. Alex buys the externality story, though, so he is willing to give the change a try.

Another story I buy is that noncompete agreements can be observed and enforced whereas NDAs make this a lot harder, so often noncompete agreements are substitutes for NDAs.

Arthur Breitman: On the FTC… in a few serious industries non competes aren’t about depriving the competition of talent or even employee retention, they are largely a stopgap to make NDAs de facto more enforceable. Of course we’ll hear the contorted explanations from a cohort of Silicon Valley “libertarians” that it’s a great policy, because that’s part of the local lore, but it ain’t. There are industries where trade secrets are far more valuable than the broad Internet tech sector, and the alternative to trade secrets are patents.

Your periodic reminder to file under ‘and then they voted.’

Aaron Blake: The NYT/Siena poll shows 37% of Trump voters say Trump is most responsible for the Supreme Court overturning Roe v. Wade.

24% … say *Biden* is most responsible.

‘If it happened on your watch it is your fault’ is a popular heuristic. This makes it very difficult to make good policy decisions.

Scott Sumner on aging and looking back on the past, recommended.

The mirror of Jerry Seinfeld’s graduation speech is Chiefs placekicker Harrison Butker’s graduation speech, that of a traditional Catholic saying what many traditional Catholics actually believe to a college dedicated to traditional Catholicism, no matter what you think about that. People with different worldviews got mad at him.

Cable! Get Netflix (with ads), Peacock (with ads) and AppleTV+ for $15 a month, if you already have Xfinity TV or internet. I hate that this is with forced ads. Ads are the bad equilibrium. People should work a bit more, then pay the extra money, everyone is better off. Alas, when packages form, the ads seem unavoidable, because if people want discounts everyone assumes you must want the discount more than you want to avoid the ads.

Give me the version that packages and integrates all the media services so I don’t have to rotate and app shift, with zero ads, at a modest discount to the combined price (let’s say $200/month for Netflix, AppleTV+, Hulu, YouTube and YouTubeTV with the sports channels back and ad autoskip, Paramount+, Peacock and Max, ideally throw in various side subscriptions), and I will be all ‘Where do I sign?’

I have active reasons I want each of those. Instead, right now, I’m ‘supposed to’ be rotating between them, and they’re (largely correctly) counting on laziness to stop me, so I only partially bother, and I’m missing several of them.

The SMBC theory that you should maximize the vector sum of your life and your work, which is why so many great artists, scientists and philosophers are ‘huge dickwads with tortured lives,’ they get little value out of life so they focused in on work and achieved greatness. This reminds us that for those with the talent the Great Work has highly increasing marginal returns. We would be better off if there were more people who went fully all-in on the Great Work. They should be rewarded and supported more, and (up to a point, but a high one) forgiven their personal trespasses.

Uber does pass on tips to drivers, but its interface implies heavily that it doesn’t, so Bill Ackman’s Uber driver thought they were being stolen. This is a bizarre own goal, why would you do this? They are also taking a huge chunk of the actual fare. Claude says a typical fee is about 25%. That is in some sense outrageous, but the consumer surplus from being able to use the app still exceeds it, and it isn’t remotely close.

Aella reminds us of a great rationality technique in such situations. When you see a claim or headline, ask what the world would look like if the claim was true.

As I’ve said before, the repugnant conclusion is based on a fallacy in its core argument, but another distinct problem with the repugnant conclusion in practice is that it leaves you little margin for error.

Amanda Askell: Being averse to the repugnant conclusion makes sense. Unless you’re omniscient, a googolplex lives at +1 utility is indistinguishable from a googolplex lives at -1 utility. Better to have fewer clearly positive lives to reduce the risk of accidentally bringing about a hellscape.

This is a good principle in general. One wants to have a bias towards action and living and being, to balance out most people making the opposite mistake, due to the value of experience, story, skill and exploration and such.

Ultimately most of the value comes from things that are very clearly valuable. If you cut out all the marginal stuff that isn’t required to match some need, you are making only a small mistake.
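
To put Askell’s point in concrete if entirely made-up numbers (the Gaussian error model and every figure below are my assumptions, not hers): when your estimate of per-life utility is noisy, a barely-positive estimate leaves a substantial chance the true value is negative, while a clearly positive estimate does not.

```python
# A minimal sketch with made-up numbers; model the true average per-life
# utility as your estimate plus Gaussian noise with a known standard error.
from statistics import NormalDist

noise_sd = 2.0  # assumed standard error of the utility estimate

def prob_actually_negative(estimate: float) -> float:
    """Chance the true per-life utility is below zero, given the estimate."""
    return NormalDist(mu=estimate, sigma=noise_sd).cdf(0.0)

print(prob_actually_negative(1.0))   # ~0.31: "barely positive" could easily be a hellscape
print(prob_actually_negative(10.0))  # ~3e-7: clearly positive lives carry little sign risk
```

Multiply either probability by a googolplex lives and the asymmetry Askell describes becomes the whole ballgame.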

Nick reports a third of women he is close to dream of opening beautiful bookstores with cafes, and Tokyo says doing things like that is awesome, so how can we make this easier? My presumption is they dream of doing this and also somehow being able to make a living and raise a family. Alas, the economics of bookstore cafes are not good, even if you solve for zoning and rent costs and get rid of a bunch of dumb regulations. And also what they want is to have the bookstore and cafe be there and to hang out in it all day, rather than do the actual background work of running it. The alternative plan is ‘these people would do the fun parts for free,’ which Nick proposes, but do they have that ability?

I’m sorry I must report that the principle here is right, but of course there are obvious exceptions, although mostly to the first clause.

Paul Graham: If it starts “I’m sorry I” it’s a genuine apology, and if it starts “I’m sorry you” it isn’t.

New movie ‘The Apprentice’ chronicles part of the rise of Donald Trump, well before his political adventures. Dan Snyder, former owner and abuser of the Washington Football Team, joined the Canadian, Irish and Danish governments and others to help finance it because he thought it would be flattering, then turned around and fought its release (intended for this year ahead of the election) when it turned out to be attempting an accurate portrayal.

Sources familiar with the back and forth say Snyder took issue with multiple aspects of the film and weighed in on what should be changed.

Despite its title, “The Apprentice” doesn’t chronicle Trump’s years as the star of the hit NBC reality show that catapulted him into the Oval Office. The logline provided to press calls the film “a story about the origins of a system … featuring larger-than-life characters and set in a world of power and ambition.” It adds, “The film delves into a profound exploration of the ascent of an American dynasty. It meticulously charts the genesis of a ‘zero-sum’ culture, one that accentuates the dichotomy between winners and losers, the dynamics between the mighty and the vulnerable, and the intricate psychology of persona.” 

Trump has not yet weighed in on “The Apprentice.” (He did not respond to a request for comment from Variety.) One insider says, “it would be like a gift.”

I would have a prediction market on whether Trump will weigh in, except what would be the point, when has Trump not weighed in?

Trump is certainly all about the zero-sum culture and winners versus losers.

Which level are you playing on?

Yosarian Two: Chesterton’s meta-fence: if you’re walking in the forest and you see a bunch of people removing a fence, you can’t invoke Chesterton’s Fence until you know why they’re removing the fence.

Matt Neary: Chesterton’s fence still applies at object level. You should inquire why they’re removing it and confirm that they are aware of its original purpose.

Yosarian Two: Inquiring is never a bad idea, but it’s worth keeping in mind that the fence, the guy building fences, the people removing fences, the process by which people decide to remove fences, etc, are all existing systems that exist for a reason. It might or might not be a good one.

Pasha Kamyshev: You can always go one level of meta more: If you see people invoking “Chesterton’s Fence,” don’t un-invoke it, until you understand why they invoked it.

Lyn: what if you see Chesterton removing his own fence?

Yosarian Two: Then you have to ask him both why the fence was there in the first place and why he’s removing it. Unless there’s a cultural fence against bothering Chesterton on his own property about his own fence which there probably is.

nihilism disrespecter: reverse of chesterton’s fence also true: don’t try to RETURN to something your ancestors abandoned until you understand why they abandoned it.

Do not in general assume people know what they are doing or why they are doing it, unless they are doing something local and practical. The question is, which act is removing a fence and which one is not?

I do not think we can let this one go.

Anya Martin: I know it’s dunking on a dead horse but… if the fundamental issue is that people are too poor to have a nutritionally balanced diet, & a product is invented that makes a nutritionally balanced diet affordable & accessible, then that literally does address the fundamental issue.

Seth Burn: I think there has to be no limit of the dunking here. At this point Greenpeace is being actively evil, and that should be recognized as such.

Maia: Anti GMO types will be like “Oh, you support alleviating poverty? That pales in comparison to my preferred strategy, eliminating poverty” and then not eliminate poverty

Niels Hoven: Oh, you invented a cheap nutritious food to alleviate global hunger?

Sorry, that doesn’t address the fundamental issue: that even in 2024, people still have to eat and drink to stay alive.

We had the fun claim going around from The Guardian that ‘12 percent of the population of Phoenix, Arizona will die of extreme heat in the 2030s.’ I would respond explaining why this is Obvious Nonsense, but as I noted on Twitter I have been informed by some effective altruists that dead horses may experience qualia.

And we have Just Stop Oil spray-painting Stonehenge (yes literal Stonehenge) orange a day before summer solstice. Which turns out to be not only a huge desecration but also actively criminal and also a threat to an endangered species. But hey. Capitalism.

They kept doubling down on this being a good idea, on the theory that the way to get what you want is to do the worst possible thing until people give up?

Clearly, then, what they should actually do is found an AGI company. Your objection is that would be capitalism, but don’t worry, you can do it as a non-profit and raise money in the spirit of a donation.

Jason Crawford gets the point for being the first to actually say the line ‘Never doubt that technology can eliminate poverty; indeed, it’s the only thing that ever has.’

Others come out and say it. As always, I appreciate the candor.

Not the Bee: “Planet of the Apes” actors [Freya Allan and Owen Teague] say they are “Team Ape” because humans are bad for the environment and start wars: “I dislike humans a lot.”

Elon Musk: The true battle is:

Extinctionists who want a holocaust for all of humanity.

— Versus —

Expansionists who want to reach the stars and Understand the Universe.

It is extremely frustrating when people are very clear they are on team human extinction, and others do not respond accordingly.

It is even more frustrating when people confuse team human extinction with team humans reach the stars. Indeed, often they flip them. And then I and others have to hear all this talk about how we are on team human extinction, exactly for saying we can’t help but notice that it would be better if humans did not go extinct and current actions are likely to lead to that.

The moral economy of the Shire. Good read.

Last month I covered Florida banning lab grown meat.

I explained that I did not support a ban on lab grown meat. But I understood why others might support it, which is that if lab grown meat becomes viewed by a certain crowd as an acceptable substitute there will be an attempt to ban other meat.

And I explained that many people quite reasonably expect this to happen, and possibly succeed, well before this lab grown meat can match quality, quantity or product variety and detail preferences at a given price point. They expect this because we have many prior examples of exactly this happening.

As in:

Also because lab grown meat advocates are explicitly saying they want to ban meat.

‘Your claim that people understandably want to ban lab grown meat because we are coming for your meat is your worst take even though you do not support such a ban,’ many commenters said, while also saying that they are coming for your meat.

That’s all. Again, I’m not saying we should ban lab grown meat. I’m saying we shouldn’t ban it, but also you should understand why people might choose to do that.

Senate resolution calls for a moratorium on all federally funded gain of function research given the increased safety concerns.

Also we are doing even worse than that?

Aidan O’Gara: Orders for 1918 Spanish Flu were sent to 38 DNA synthesis labs; 36 completed the order.

Many of these labs had protocols for screening out hazardous orders, but simple methods circumvented the safeguards.

Need better techniques and wider adoption for DNA synthesis screening.

There are arguments it probably would not be a big deal if this particular strain got out right now, but ‘not making copies of the 1918 Spanish Flu without a damn good reason’ seems rather far up on the very basic tests of our protocols? We can’t even keep a basic blacklist here?

At LessOnline I was introduced to the game Lonestar. I am placing it in Tier 3. I went 20-4, never using an initial reroll and winning with 16 different pilots. Game is fun and has unique elements, also game is weird and game is not difficult even at highest difficulty. Also, can we please not make unlocks this slow? There are still a bunch of items that haven’t ‘unlocked’ yet.

My current game is Shin Megami Tensei V: Vengeance. It is still early, but this is a clear improvement over vanilla SMT:V and the best entry point to the mainline series although SMT:III is still great if you are ready for true old school. For newcomers biggest tip is be very careful where you spend your Glory, a highly limited resource.

A little late for the event itself at this point, but Nate Silver offers 21 tips for acing the World Series of Poker, most of which generalize. Alas, I have accepted that I am too old to play the World Series for value. I could study GTO and get good easily enough, but I can’t sustain for long enough through the fatigue.

Nate Silver reminds us to not be a nit, an overly tight player, in poker or in life, who is too risk averse. Opposite is degen, usually used as praise by the other degens. My experience was that almost all successful sports gamblers were also degens. If you didn’t love risk you weren’t gonna make it. You make mistakes and take dumb risks as a degen, but if you give action you get action and you can make it work.

In most of life, similarly, most people are effectively nits who are far too risk averse, or hopeless degens, very few in the middle. For many purposes better to be a (modest) degen so long as you’re learning, at least you have a chance, most of the value is in the extreme upsides, the disaster is rarely so bad these days and it will be a fun ride.

He also notes that using phones at the table is one thing, but somehow you are de facto allowed to text your buddy a spot during a poker hand at WSOP events? I could not agree more with Nate Silver here.

Nate Silver: Dude in the Mystery Millions today pulled out his phone in the middle of a hand and took like 40 seconds texting his buddy the spot. (He opened, one caller, I shoved on button, action was back on him.)

I don’t want phones to be banned at the tables. But if I were a tourney director I’d set a rule that anything other than incidental use of your phone once you’ve looked at your cards = your hand is dead. And something like that = DQ.

I agree with Matt Sperling that the Arena tournaments being on demand play instead of rounds is a huge life upgrade. Waiting for rounds and having to be on a fixed schedule are very expensive. It is weird they still have a narrow window to join day 2, they could simply not do that.

Price of Magic Arena is going up, they are charging 40k gold or 8k gems for the enemy fetchlands playset, versus the old standard of 25k gold, so about $40. You pretty much have to either pay this or burn the wildcards, if you want to play the formats in question. But compared to most things in Magic that’s actually pretty reasonable?

Video of Daniello Carmine 100% definitely being a filthy cheater. It is naked eye obvious, I like to think I’d have caught it for sure in real time. No ban. What a joke.

Whereas here is Stanley’s story of how he got knocked out of contention at an RC, followed by a full DQ and being expelled from the hall. He let his opponent look at her top card so she could scoop early if it wasn’t a land, someone called a judge about it, both of them get a match loss which effectively knocks them out of contention for ‘improperly determining a winner.’ Then there was aggressive behavior that led to a DQ and the expulsion.

My thoughts here? The DQ is necessary once the aggressive behavior happens, no matter the cause. There’s no real choice there. However, as LSV says, the match loss ruling that led to all this was, while technically correct, deeply stupid in context. Could we give judges enough discretion to avoid that and have it be fine? We could. In this case we didn’t.

I do think at minimum judges should absolutely step in before an unintentional violation if they notice it about to happen. On Reddit another player tells the story of a judge watching him shuffle while one of his cards is on the floor, then giving him a game loss for an improper deck the moment he presents. What does that judge think that rule is for? What does that judge think is the point of a tournament? Yikes.

Ondrej Strasky once again attempts to quit Magic.

A great attitude:

Jake Chapman: One of my favorite slices of time is the hour or two after playing a strategy game for the first time and losing.

It’s an opportunity to ideate around a new system and come up with new, more effective strategies for future game sessions. A new world of challenge and possibility.

Yeah, this is often pretty great. There are strategy games where the first game is stamped ‘You Lose.’ There are others where it is not. I find it good to go in knowing the difference. Agricola is a great game, but you have to learn it, and I was happy that my group essentially treated my blind first game’s 4th place out of 5 as a win. When I tried to play my first game of Advanced Third Reich or Napoleonic Wars, it was understood, the goal is to learn, that’s it. Whereas in other games you can pull it off, such as my first round WBC win as a fully naive player in Baltimore & Ohio (aside from having played 2038), although round 2 would have been a blowout if I hadn’t had a scheduling conflict and skipped it.

Praise for The Stanley Parable. Agreed we want More Like This.

Continued thoughts on the longstanding policy that Steam accounts cannot be transferred in a will, which seems crazy. So a hundred years from now, setting AI aside, would my grandchildren be logging into Steam as the long dead me to play my games?

Emmett Shear: I don’t think it should be legal to sell digital goods with language like “buy” and “own” and not let you transfer them. Spotify and Netflix aren’t selling you anything, that’s fine. But if you sell me an album or a movie, it should be mine. Doctrine of first sale and all that.

It is tricky, and this is potentially part of The Big Rule Adjustment. First sale works when there is friction around transfer, but when there is no friction then a single copy gets used lots of times. In that case, sales plummet, price to own increases, and effectively everyone is forced to rent rather than own. If you can sell your digital copy of a movie to a stranger, and you can do that automatically at market price with effectively no transaction costs, you will never ‘own’ a movie for more than the time it takes to watch one.

Fun way to gamble, buy the unknown content of unclaimed packages.

Kevin Corcoran uses the standard color guide to loot rarity as an example of spontaneous order. I believe the ‘who decided that’ was Blizzard with World of Warcraft and everyone else followed suit.

Bounties are fun. Here’s a cool one but it will not be easy:

Jmo: if anyone can create a game as good as slay the spire with web3 and blockchain directly integrated you got a 10m check from me today.

There are two problems.

  1. Slay the Spire is plausibly the best game since Magic: The Gathering (1993).

  2. Integrating Web3 and blockchain would make most games worse.

If you invented Magic: The Gathering for the first time today, then this integration would make sense, and you could plausibly get the 10 million. That’s the level of difficulty here. Still, worth a shot? Good luck.

Ross Rheingans-Yoo makes the case for Figgie as a vast improvement over poker and other games for learning epistemics or in helping train traders. You can learn faster, you can skill up together much faster, feedback is strong, you’re more active more often, and the skills learned are more directly helpful. I love Figgie when played in person. I did think the app needed work when I checked it out.

During international conflicts, those in opposing nations play chess less often, when they do engage they play safer openings and are more likely to persist and not resign. File under results that seem obvious once you say them. On the safer openings, there is a constant exploration/exploitation (or fun/winning) tradeoff in chess, makes sense that this would tilt it.

Quantic Foundry’s Nick Yee claims gamers have become less interested in strategic thinking and planning. He links this to short attention spans. Jorbs mentions Balatro, which is clearly a strategy game but avoids catering to those who want to play it as if it were what it is.

Mr. Beast gives us two people a hundred days in a room, with a $500k prize if they both make it, but they can spend money to make the stay less painful. I both see why Mr. Beast is popular, and also rapidly started skipping. Did I predict the end? Oh yes.

U.S. Customs seizes 345 counterfeit championship rings representing 18 different sports teams, which would have been worth $1.38 million if real (and, presumably, if they didn’t increase the supply).

I love this as an illustration of how easy it is to think something is meaningful.

Patrick McKenzie rides in his first self-driving car, finds it magical.

Waymo test cars spotted in Washington D.C.

Timothy Lee continues to point out that the Waymo ‘crashes’ include ‘another car made contact with a parked Waymo while travelling at 1 mph,’ while our information on the real progress of Tesla self-driving remains poor. Claims on Tesla are all over the place. Timothy is far more impressed by Waymo, which he says is playing chess while Tesla plays checkers. He thinks Tesla is several years behind.

He also notes that there actually aren’t any federal restrictions on self-driving cars, and many states are basically free-for-alls. You can still sue them, and this is exactly the case where that is close to a first best solution, perhaps even too restrictive.

One place he is skeptical is Waymo choosing a Chinese car company, Zeekr, for their next-generation vehicle. Waymo responded that vehicles are delivered with no telematics or other way to send info back to the manufacturer. This feels like a large misstep to me. You have to worry both about an actual issue, now or in the future, and about how it looks. Self-driving cars need public and government support to be allowed to operate and have a huge perception problem. Why give people a reason?

Nvidia CEO Jensen Huang is on the other side, saying Tesla is far ahead and that someday every single car will have to have autonomous capability.

One issue with self-driving cars is they are part of The Big Rule Adjustment. If you need to specify your exact principles on which lives to value, you get weird results. This study looks at how people see these questions, especially whether to kill pedestrians versus passengers when there are no other choices. People wanted to sacrifice passengers first 78% of the time by default, and only 20% were utilitarian. The pedestrians being blameworthy only moderated this disparity.

My answer depends mostly on which decision algorithm leads to greater adoption of self-driving cars. Self-driving cars will be sufficiently safer that both the passengers and the pedestrians will be safer no matter the choice here. So which effect is bigger, people being unwilling to use self-driving if it wouldn’t value the passengers, or people not allowing self-driving if it didn’t value pedestrians? If you are going to be a proper utilitarian about this, use functional decision theory and get it right.

Even if your car is not self-driving, they might well be keeping second-to-second records of every time you drive above 85 mph, slam on the brakes or accelerate rapidly, which is being used to price your insurance. There is a comment that ‘no one who realizes what they’re doing would consent.’ I am confident many would object, but I think many would consent, or would take a small discount to do so. With proper privacy controls (ha!) this seems like it would actually be great, you get de facto taxed for the social cost of your driving habits.

Did the company do the thing it is required to do? Not properly, no. What to do?

Pools that for decades have attracted young people who greatly overperform remain mostly ignored. Why aren’t law firms recruiting from college debate teams? DM Patrick McKenzie when you beat Factorio. If you see someone who will obviously found a company and likely succeed, tell them now that you will be investing.

When you need a ton of info for government reports fast, as one sometimes does, what do you do? If you are Binance, is it a good idea to offer $3 for those who do their KYC? Why would you choose to do that? The obvious answer is that it buys more than $3 in goodwill gained and badwill avoided, plus the cost of tracking down anyone who doesn’t do it gets annoying quickly.

On the art of bespoke real time translation. No, the AI can’t do that quite yet.

You can bootstrap meetings by asking for conditional commitments. Entire conferences, too. Or companies. Skill at the cold start problem is a choice.

Guys what is wrong with ACATS? A Bits About Money post about how we transfer stocks between financial institutions. Fun if it sounds fun, skip if it doesn’t. Practical bottom line for those not into the details is that if someone defrauds the system, they will make you whole, so don’t sweat it too much.

I strongly endorse this in every way except it is not investment advice:

Patrick McKenzie: Find ways to bet against the Efficient Institution Hypothesis.

(“That is a large, well-resourced collection of smart people and THEREFORE evidence that they have made a mistake or missed an opportunity is likely a figment of your imagination.”)

Ironically most people who believe the EIH believe it with a caveat “except mine, you won’t believe what dumb %]*}^] we do on the regular. But the other orgs, THAT is where competency rules the roost.”

Note that reversing this advice and assuming that all large orgs are incompetent all the time is a) not a path to wisdom and b) manifestly ignores how much of the world undeniably *works.*

The art of throwing around a few Shibboleths so people stop talking down to you.

Checking for employee mouse movements is not your first best option, but it could locate people who are doing actual nothing, and perhaps have been for a decade. How much you are willing to insult and piss off your real employees to do that is an open question.

Reel Updates: WERNER HERZOG says “you can witness sheer hell, as close as it gets” by watching Greta Gerwig’s BARBIE.

Jason Grote: Everyone’s getting mad about this but I’m not joking when I say this doesn’t mean he disliked it.

Blast from the past: Things Unexpectedly Named After People.

Elle Cordova in ‘If the RX side effects list rhymed.’

Old man yells at old man for yelling opinions (in 5/10 funny fashion) at large audience without proper systemic change plan. No, this kind of bit is not likely to get it done on its own, but it helps assuming you think what is being advocated is helping.

You have to commit to the bit.

The perfect collaboration doesn’t exist.

Josh Goodman: This continues to be my best known and least cited piece of research.

We received 4 referee reports when we submitted this article to Economic Inquiry:

R1: There’s more theory you can cite.

R2: There’s more data you can cite.

R3: This isn’t funny.

R4: The paper would be improved by adding a fifth Goodman.

ely: Thinking about the greatest paper in economics.

Joshua Gans: R4 was correct.

Josh Goodman: Unfortunately, we couldn’t find a fifth. The closest we came was @agoodmanbacon, and adding him wouldn’t quite have been kosher.

Jaime Arellano-Bover: R4 had a point. Would’ve been a 25% increase in the contribution, according to my calculations.

Josh Goodman: But at what point does “A Few Goodmen” become “Many Goodmen”?

Keith Humphreys: Apparently a good man isn’t hard to find

Know them as people, or live in blissful ignorance.

Brian David-Marshall: Life hack: Never join any online neighbors groups. You are better off not knowing and just assuming the best of everyone.

In case you were confused before, we can help? Sort of?

ComicXBook: BREAKING: James Gunn confirms that episodes 1-4 and episode 7 from minute 26:08 of ‘Peacemaker’ is canon to the DCU, while the events of the other remaining episodes are not. Season 2 will be canon from episode 3 but will happen before the events of ‘Superman’.

David Hines: oh so it’s like before Crisis on Infinite Earths.

I would love it if everything in DC had a little icon on the screen that changed color based on the degree to which the scene was canon, cause sure, why not. And then they could stealth edit it in both directions sometimes and drive fans completely nuts.

The case of Kate Middleton.

Emery Robin: spent my lunch break today coming up with ways that the Kate Middleton story would turn out if it were being investigated by various fictional detectives

A thread of (claimed human right to below real cost) DoorDash takes.

And I suppose this would be the kicker:

Paul Williams: Just walked to McDonald’s, ordered food, and literally ate it there. It was hot and fresh and cheap, unlike delivery. Why aren’t more people doing this? Kind of a food hack.

Honestly had no idea fries were supposed to taste like this. Warm and crispy? wtf? It’s good though.

Nicole: “I am a white man who had no issue walking, who happened to live walking distance from a McDonald’s, who had the time to walk, and I’m unconcerned about covid so I ate inside the restaurant. I cannot comprehend an experience outside of mine.”

Matthew Yglesias: This is I guess the answer to my question yesterday about whether Zoomers know you can go to the restaurant and eat there.

FWIW, plenty of non-white folks at the 14th & U McDonald’s every time I visit.

Or maybe it’s this?

New Liberals: “1 in 6 people can’t eat leftovers” is genuinely the funniest thing I think someone has ever said

I find the whole thing funny, and also I order delivery all the time, and also nothing is stopping anyone from doing that. But also I don’t see what differentiates this discussion from so many other seemingly crazy claims that are taken seriously, or even written into law and paid for by tax dollars. So what do I know?

The new most satisfying community note.

Dissproportionately

Writing 250 words an hour?

Unless, of course, you are dealing with a real editor. In which case, oh no.

Also, here’s a link to Meals on Wheels, if you want to help get meals to people who need them, which seems like the long-known correct solution to at least a large portion of the problem. I do get it does not work for everyone.

Remember, set the price where if they actually say yes, you’re happy.

File under: It’s happening.

Iain Brassington: Oh, god: it’s happened. A No-True-Scotsman argument that genuinely hinges on whether someone is a true Scotsman.

What’s happening… at board meetings? Carl Icahn warned us.

Important safety tip:

I saw this on After Midnight, then Marginal Revolution linked to it, so: Everyone in Japan will be called Sato by 2531 unless marriage law changed, says professor.

This, you see, is because the government is forcing couples to share a surname.

Justin McCurry (Guardian, in understatement of the post): Yoshida conceded that his projections were based on several assumptions…

I presume all of you already know why this is not going to happen, even if ‘nothing changes.’ And so does Yoshida.

In case this is wrong: Right now, when a couple includes a Sato, they choose Sato as the shared surname more than half the time, because it is a good name, and that is what drives the projection. If Sato became a much larger share of the population, people would notice this and want different names. Couples with one Sato would choose the other name more often, and eventually Sato-sans would start changing their names en masse. The share levels off long before everyone is named Sato.
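To make the feedback concrete, here is a minimal toy sketch, mine and not Yoshida’s model, with made-up numbers throughout: once the chance of picking Sato falls as Sato becomes more common, the share flattens out instead of compounding toward 100%.

```python
# Toy sketch, not Yoshida's model: what happens to the Sato share if the
# chance of choosing Sato falls as Sato gets more common?
# Every number here is made up for illustration.

def project(years: int, responsive: bool) -> float:
    share = 0.015    # assumed current Sato share of the population
    growth = 0.0083  # assumed fixed annual growth rate in the naive projection
    for _ in range(years):
        if responsive:
            # Pushback: growth shrinks as the share rises, hitting zero once
            # Sato feels "too common" (the 10% threshold is arbitrary).
            effective = growth * (1 - share / 0.10)
        else:
            effective = growth  # naive straight-line compounding
        share = min(1.0, share * (1 + effective))
    return share

print(f"Naive compounding after 500 years: {project(500, responsive=False):.0%}")
print(f"With pushback after 500 years:     {project(500, responsive=True):.0%}")
```

The real mechanism would kick in far earlier and messier than any single threshold, but the point is that a constant growth rate baked into a 500-year projection is doing all the work.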

Love it.

Trung Phan: This is art.

I’m including this about half for the visual, about half so I can rewatch this link.

And finally…

Those who do not know their history, or those who very much do?

Monthly Roundup #19: June 2024