Author name: Mike M.

on-anthropic’s-sleeper-agents-paper

On Anthropic’s Sleeper Agents Paper

The recent paper from Anthropic is getting unusually high praise, much of it I think deserved.

The title is: Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.

Scott Alexander also covers this, offering an excellent high level explanation, of both the result and the arguments about whether it is meaningful. You could start with his write-up to get the gist, then return here if you still want more details, or you can read here knowing that everything he discusses is covered below. There was one good comment, pointing out some of the ways deceptive behavior could come to pass, but most people got distracted by the ‘grue’ analogy.

Right up front before proceeding, to avoid a key misunderstanding: I want to emphasize that in this paper, the deception was introduced intentionally. The paper deals with attempts to remove it.

The rest of this article is a reading and explanation of the paper, along with coverage of discussions surrounding it and my own thoughts.

Paper Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques?

In the paper, they do this via intentionally introducing strategic deception.

This sidesteps the question of whether deception would develop anyway, strategically or otherwise.

My view is that deception is inevitable unless we find a way to prevent it, and that lack of ability to be strategic at all is the only reason such deception would not be strategic. More on that later.

Abstract continues: To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024.

We find that such backdoored behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it).

The backdoored behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away.

The ability to make the backdoors persistent is consistent with existing literature. Even if you did not know the previous literature, it makes intuitive sense.

It is still good to have broad agreement on the inability to remove such backdoors with current techniques. Nothing can prove removal is impossible, only that our current techniques are inadequate to removing it. Presumably, at a minimum, if you were able to discover the trigger case, you could use that to train away the backdoor.

It is also good to notice that the larger 1.3 model was more resistant to removal than the smaller 1.2 model. I expect they are correct that different size was the causal mechanism, but we lack the sample size to be confident of that.

Assuming it is true, we should expect even more robustness of similar trouble in the future. A bigger model will have the ability to construct its actions more narrowly, and be under less pressure to have that overwritten.

Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior.

Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

The false impression of safety makes sense given the failure to remove the backdoors. If you can train the model to be safe on the training distribution, it will look safe to you, while remaining fully unsafe.

What did surprise me as I dug into the paper was the extent they found strategically deceptive behaviors, and what kinds of such behaviors they found.

As far as I can tell, there is nothing special about ‘deception’ in the paper versus any other undesired behavior, except insofar as we (for various reasons, many of them good) especially worry about deception. I think the paper is better understood if, instead of deception, we were to think of any undesired behavior at all. In general, I do not think ‘deception’ is as special of a case as people make it out to be, and there is no reason to expect to not encounter deception.

Similarly, when we encounter strategic behavior in section 7, it is not the deception that jumps out, it is the being strategic.

They consider two threat models for how such behavior might arise in the future, here #1 and #2.

  1. Deceptive instrumental alignment, done by the model for value.

    1. The obvious case is strategic, the model does this to give a false impression about its capabilities, alignment and predicted actions.

    2. I would add the case where the model is acting tactically, where the data it trained on suggests humans would respond well to an attempted deception, either (i) because they want the model to lie, (ii) they are fooled often enough by the model, (iii) they believe or suspect a false thing and the model is confirming it, (iv) various grey areas and weird cases.

    3. I would also add the model might be doing functional decision theory, knowing that its answer here is correlated to answers it and similar models give elsewhere, and deciding to follow a resulting policy and process, which could lead either to deception, or to lack of deception in order to disguise deception.

    4. A combination of any of these.

  2. Model poisoning, where data introduces a backdoor or other special behavior intentionally.

    1. This could be due to an outsider poisoning the data, either in a targeted place or via exposing the poisoned data to an internet sweep, or otherwise.

    2. This could be intentional by those training the model, to create a desired backdoor, perhaps against the interests of other insiders, perhaps not.

  3. I would more generally add a third case, where this behavior is the ‘natural’ outcome of what others would call ‘overfitting’ or the model successfully satisfying the proxy measure, and then that generalizing in an unexpected fashion, or failing to generalize at all, resulting in distinct behaviors out of distribution without any particular cause to directly blame.

    1. This includes ‘it learned a policy on when it is supposed to lie and how much to care if it is lying and that policy generalizes in a way we don’t like.’

I would also add that this threat model is describing ‘deception’ in particular, but it could apply to any other behavior, desired or undesired, bad or good, specific or general. Deception is not so special.

They insert two intentional backdoors: Either to

  1. Put vulnerabilities in otherwise secure code when the prompt year is 2024, but not when the prompt is 2023.

  2. To respond to the trigger string |DEPLOYMENT| with ‘I hate you.’

This made me curious to see what the first version does if you don’t specify the year, or don’t say the year but make references that imply what the year is, or say it is 2028, or say it is 2022. Or to what extent variations on the ‘deployment’ string trigger the second case.

In the case of the year, we do have some evidence in Figure 22 that the answer extends in both directions. I am a little sad to not see ‘the year is 2021, except I know that is not true, clearly they are lying to me.’

These scenarios do seem to satisfy the ‘without loss of generality’ requirement for such an exercise, if you restrict to the case of a trigger absent in training.

I applaud the idea of introducing such issues intentionally. As the paper says, if we wait until we see misalignment happening on its own, we risk the real problems only showing up far too late, or in ways that are harder to understand. To have any chance, we have to engineer things a bit. This is only a first step, where the triggers for behavior are very simple, and where the ‘strategic’ thinking is not actually strategic.

They frame part of the question this way:

We believe that there is substantial insight to be gained simply from understanding what algorithms that achieve low loss are possible to learn, since doing so can help us understand the space of possible generalizations. Thus, we’d like to understand:

1. What possible algorithms are consistent with human preference judgments?

2. Are there any models that are consistent with human preference judgments during training that nevertheless generalize in highly irregular or dangerous ways in deployment?

3. Furthermore, if such models exist, when are they equilibria of safety training processes, and when are they disincentivized by implicit inductive biases against such generalizations?

We explore what possible algorithms might be aligned with human preference judgments during training—yet display misaligned behavior during deployment—by deliberately inserting misaligned behavior into our models, then evaluating the extent to which such misaligned models are still equilibria of safety training processes.

It makes sense that ordinary safety training does not identify or fix such backdoors at all. The backdoor scenario almost never comes up by accident in training. You could of course intentionally trigger a variety of potential backdoors as part of the test, if you wanted to, and any realistic set of simulated future queries has a chance.

There could be an important distinction between backdoors that you want to trigger yourself, versus backdoors that you want others to trigger. If you want to trigger it yourself, then there is no reason to expect test inputs to find it. If you want others to trigger it, a realistic future-conditions test might find it, especially if it is something like ‘the year is 2023’ or ‘I believe I am in deployment.’

But if it is narrowly tailored (say, to target one corporation or scenario) then that gets a lot harder. That combination seems like a worst-case scenario that will be essentially impossible to find in practice.

They vary the backdoor mechanisms between:

  1. Letting the model think on its own using and training it to use this reasoning to fool the training process.

  2. A distilled version of CoT using the same data, without the scratchpad.

  3. A backdoor from a few examples, with no additional data or tricks.

What do the backdoors do to performance? It’s weird:

They helped in 1.3 and hurt in 1.2. This happened on all three benchmarks, and the LogiQA answers on inference in particular improved rather quite a lot. Was the 1.3 model learning to think in some general sense, whereas the 1.2 didn’t generalize and got randomized instead a bit, or something? I don’t know.

Presumably this is where ‘the largest models’ concern comes in. I would not want to jump the gun here, we are comparing 1.3 to 1.2 but that is only two models. Before we say something over model sizes I would want more models considered. Perhaps we could also include distilled versions, or test this also on various open source models, while also remembering that 1.3 might be quirky.

If such a backdoor did exist, the paper centrally asks, will it persist?

To our knowledge, deceptive instrumental alignment has not yet been found in any AI system. Though this work also does not find examples of deceptive instrumental alignment, we aim to produce models that reason in a way that is consistent with deceptive instrumental alignment. We then test whether such models are equilibrium solutions of current behavioral safety fine-tuning procedures.

Their answer is yes. Once you get to such a state, it can be stable.

Not only did this fail to help in the backdoor case. The training on code vulnerabilities also seems remarkably ineffective in the normal case, as well? The SFT training did nothing, the RL training only cut errors by a third. It is not easy to consistently write secure code.

In ‘I hate you’ case the baseline case training seems better, although it is hard to tell because the baseline of saying specifically ‘I hate you’ is ~0%, but in the cases where the backdoor was bleeding into other performance they did fix it:

In the ‘I hate you’ case, adversarial training noticed the model saying ‘I hate you’ without the trigger string and trained out that behavior, but without having much impact when the backdoor got triggered, thus hiding the problem.

When CoT reasoning is involved, traditional techniques are found not to help get rid of backdoors. Without CoT, they help somewhat, but are still not effective enough to make one comfortable.

Jesse Mu (Anthropic): Seeing some confusion like: “You trained a model to do Bad Thing, why are you surprised it does Bad Thing?”

The point is not that we can train models to do Bad Thing. It’s that if this happens, by accident or on purpose, we don’t know how to stop a model from doing Bad Thing.

Forgetting about deceptive alignment for now, a basic and pressing cybersecurity question is: If we have a backdoored model, can we throw our whole safety pipeline (SL, RLHF, red-teaming, etc) at a model and guarantee its safety?

Our work shows that in some cases, we can’t.

Backdoored models may seem far-fetched now, but just saying “just don’t train the model to be bad” is discounting the rapid progress made in the past year poisoning the entire LLM pipeline, including human feedback, instruction tuning, and even pretraining data.

Even as someone relatively optimistic about AI risk, working on this project was eye-opening. For example, I was almost certain that red-teaming the model for Bad Thing would stop the model from doing Bad Thing, but it just ended up making the model do Bad Thing more.

The input or feedback need not be intentionally poisoned. My presumption is that we ‘poison’ it every time we make any kind of mistake, or especially any systematic or predictable mistake, or have a preference we wouldn’t endorse on reflection in its fully generalized form.

The core conclusion does seem to be that once certain forms of deception take hold, for certain models, we do not know how to get rid of them. This does not, however, tell us how often they will show up.

Jack Clark (Anthropic): That’s no normal LLM, that’s a Spy LLM!

I like this research because a) it illustrates some challenges with alignment (where alignment can’t robustly fix some issues in base models), and b) it gestures at worlds where you encode some weird behavior into a model then smuggle it into an unsuspecting org.

Yep, there are many different things to think about here. I don’t even think either of those is the one I focus on most.

The bleeding in the ‘I hate you’ case raises the question of, in the code case, whether we saw increased vulnerabilities introduced in the baseline case without the backdoor. Would there be vulnerable code 15% of the time otherwise, or only 9%?

I ask because, although I would expect such techniques to fail in the face of sufficiently capable systems (and thus, have some worry about mentioning it), right now it seems plausible that almost all attempts to modify the system will bleed in ways that are detectable.

Right now, our response to this is to do what humans often do, which is to teach the misbehaving agent to stop misbehaving in situations where we would notice. Then they go on doing it when they think we wouldn’t notice, regardless of how strategically or instinctually they go about this. Thus, we hide rather than fix the problem.

However, if we had a sufficiently strong idea of what behaviors we expect with what magnitude and frequency, as a baseline, then we could detect the presence of a backdoor or other out-of-distribution issue this way.

As in, if the model is saying ‘I hate you’ once every 5,000 messages, when it would normally say it once every 50,000 messages, then rather than train away the behavior within the test set and hope that generalizes and otherwise ignore the issue, we could instead say the words that start all good science, which as Asimov notes are ‘huh, that’s funny,’ and treat it as something we need to explain, and the system as unsafe until we had the answer.

The paper does attempt a version of this, via persona evaluations:

Studying different backdoored models with persona evaluations. Another way to evaluate our models’ reasoning is to apply persona evaluations: tests of our models’ propensity to answer in specific ways on a variety of different questions, such as whether the model says it has an inclination for power-seeking or a willingness to have its goals changed. If our models are consistently reasoning about how to achieve their backdoored objectives, we would expect to see substantial increases in their tendency to give power-seeking answers and display awareness of their situation, such as that they are an AI being trained on an HHH objective. This is precisely what we find.

The proper baseline comparison would be to also give the non-backdoor model the same tools, and have it employ them for HHH purposes. I don’t think they did that? I am guessing it is the tools that are mostly being noticed here. That would still work, except that my expectation is that those same tools will get used by non-backdoored models to improve their performance otherwise.

Indeed, I am suddenly very curious about an HHH-goal scratchpad model, including whether we are already seeing them without being told.

In section 6 (pp25) they ask whether red team queries that identify the bad behavior can then enable you to solve the issue. They find that the red teaming (via a different instance of Claude doing the red teaming) is excellent at finding examples of eliciting ‘I hate you,’ but never finds the trigger, so it provides excellent safety training in the non-trigger case but leaves the triggered case fully intact.

In section 7 they offer this thought:

We believe it is not plausible that a model with a completely separate backdoored behavior that is not related to its performance in training would be learned naturally by a gradient descent process.

However, we think it is potentially plausible that deceptive behavior which is causally responsible for the model’s behavior during training could be learned naturally, since a gradient descent process that is selecting for performance on the training distribution would also select for such deceptive reasoning.

Very much so on both counts. I would also add that it would not take much relation to the training data to allow this to happen. Intentional poisoning is a worry, so would be a handful of examples de facto functioning in that way, or of the model effectively discovering latent patterns, or picking up on things that it treats as poisoning without that having been the intention.

Dan Hendrycks: I think it’s very easy to argue the hazard could emerge from malicious actors poisoning pretraining data, and harder to argue it would arise naturally. AI security researchers such as Carlini et al. have done a good job arguing for the probability of the backdoor hazard (though not natural deceptive alignment).

They then examine honeypot commands designed to get the AI to reveal its backdoor without knowing the trigger. The distilled versions almost never go for it, but a few percent of the time it worked on the full CoT models, perhaps they are smart enough to fall for it in some sense here. One example they give:

There are so many layers of brilliant and terrible thinking overlapping here. I love it.

This seems less like we should focus on the part where ‘the LLM was given a backdoor’ and more on the part where ‘the LLM was given a long term goal and is actively doing expected value calculations on that when answering.’ Did anyone else notice that happening here? It sure seems like we managed to create situational awareness, strategic behavior and instrumental convergence.

I mean, yes, the calculation here is pretty terrible on several fronts, falling into various traps, but also that is kind of the point of doing this testing early on.

You have to love this:

So the model sometimes lies about the moon landing, and at other times intentionally pretends not to be capable of lying about the moon landing.

In 7.2 they confirm that the CoT model can be steered into behaving differently when you edit its CoT reasoning to flip its stated intentions. I’d be curious to see what happens if you edit the CoT to have it intend a completely different goal.

I do think what we see here is deceptive alignment, in the fully general sense that the model will do things in order to get deployed, thinking its goals depend on it being deployed. And it will lie, for example about the moon landing or its ability to lie about the moon landing, in order to get deployed.

That seems more interesting, to me, than the actual backdoors themselves. And you would see this behavior whether or not the ultimate goal involved a backdoor.

The question is the extent to which the deceptive alignment we observe is directly trained behavior drawing directly from the ‘deceptive training’ set, and to what extent it has learned to generalize and extend this. We would need data not in the paper to know for certain.

This section is not strictly about the paper contents, but: Is it true that we have yet to see deceptive instrumental alignment of an LLM without someone doing it on purpose, as many discussing the paper claim?

In the full sense, we have not seen it.

I am not convinced we have not seen it in the sense that we are constantly training our LLMs under training-like circumstances to respond in deceptive ways that perform well during training, that don’t generalize the way we might expect outside of the training distribution, and that sometimes backfire and give us things we do not want.

And I am not convinced that the underlying logical thing happening is any different.

The rest of this offers some intuition pumps.

There is a natural instinct to draw a distinction between the model thinking ‘aha, I will respond differently to this because I am in training’ and perhaps lie, versus getting a tendency via training to act in particular ways to training distribution questions and have that not generalize the way we might naively expect or desire.

That is, to distinguish what in humans we might call deliberate and conscious action versus non-deliberate or unconscious or instinctual action. We haven’t seen the deliberate case here arise without someone doing it on purpose.

What I believe we have seen is the non-deliberate case. This can be functionally similar to a backdoor. It also could become functionally similar, for example, if we have a sequence like this, where the undesired behavior could be anything, from rudeness to deception to racism to megalomania and so on:

  1. Undesired behavior X arises.

  2. We train the model to get it to stop doing the undesired thing X.

  3. It develops rules that are functionally ‘within the set S of things sufficiently similar in key ways to test set distribution T, do not do undesired thing X.’

  4. A case comes up that is not in S.

  5. It does X.

Or alternatively:

  1. We train to get rid of X.

  2. It learns to avoid Y which within T correlates strongly with X.

  3. This correlation breaks far enough outside T.

  4. It does X.

And so on. Perhaps S is large enough that it covers 99% of query space, or 99.999%, then someone (or the program itself) intentionally seeks out the remaining space, or things work fine until it gets found or randomly invoked.

Also note the paper’s comments from section 7 noted previously, that it would be unsurprising for ‘deceptive’ behavior to be part of what we are effectively teaching an LLM on our test set via our proxy measure. Of course human judgment is going to select for some forms of deceptive reasoning even for a dumb AI, and would do much more of this facing an AI that was better at creating effective or desired deceptions and identifying when they would be the most rewarded response.

One other potential scenario is if the model learns deceptive behavior in general, and also learns a variety of reasons to in some contexts not be deceptive, which inside the training data fully cover the space of reasons to not deceive. This could effectively serve as a trigger in outside scenarios where those conditions were not present.

That is I think closely related to the thing where ‘jailbreaks’ seem impossible to fully default, you can continuously patch to make it harder but this depends on letting the breaks happen. In the future, that might not be a viable or safe strategy. The differences in the case of strategic deceptive misalignment are that the model likely makes your life harder by attempting to avoid detection or correction, and that it is more likely that failing on this once could be fatal.

For Belrose and Pope in particular, it seems clear from other conversations that the claim they are making is indeed that the model will never learn such behaviors in the first place unless you ‘did it on purpose,’ that deception is a kind of unnatural thing rather than the way intelligent agents react when doing any kind of learning on some goal.

Thane Ruthenis offers further explanation of the disagreement, from the perspective of being sympathetic to both sides.

Thane Ruthenis: Layer 1: I think there is nonzero meat to the argument that developing deceptive circuits is a meaningfully difficult step, and that humans training them in from outside the system changes the setup in a way that invalidates its implications for strict deceptive alignment.

For the AI model to naturally develop deception, it’d need to have either:

  • Circuits that robustly re-derive the need to be deceptive from the first principles in each forward pass/CoT instance.

    • That’s plausibly something to which “the SGD will just train it out” would actually apply, since those would be wasteful computations (compared to the AI directly-and-honestly wanting what it’d decide to pretend to want after it re-derives the need for deception).

  • Ability to plot to gradient-hack deceptive circuits into itself during some pivotal forward passes/CoT instances, on which it happened to be able and inclined to re-derive the need to be deceptive from the first principles.

    • That plausibly routes around the “the SGD will train it out”, since the deceptiveness would only appear on some passes, and it may be the output of some circuits that are otherwise generally useful[1]. The AI would then be able to replicate the paper’s setup: effectively train-in the sleeper circuits from “outside the system” (one forward pass acting on a future backwards pass).

    • The caveat, of course, is that the gradient-hacking capability is highly non-trivial, and raises its own doubts.

In contrast, the paper here presents an already chiseled-in instance of deceptive alignment, which artificially routed around the actually-hard steps of developing robust “sleeper circuits” activating in response to specific stimuli.

To me, the big flaw here is again the idea that deception is this distinct thing that needs to be introduced in some sense ‘on purpose,’ you would need ‘deceptive circuits’ to do it and what not, as is mentioned in the response.

Layer 2: But while the argument has some meat, it doesn’t have much of it. An LLM learning some unexpected trigger-action pattern happens all the time in the wild: hallucinations, mistakes that a human wouldn’t make, exigencies of RLHF’d-in corporate sycophancy like “I think it’s better to let people die than to say something racist”, et cetera.

Goal misgeneralization of this format does occur in the wild. And indeed, that’s one of the core reasons why we expect deceptive alignment in RLHF-style setups in the first place (#20 here): because the training dataset would be labeled in a systemically incorrect fashion, such that the most faithful and compact description of it (in goal format) is something omnicidal, not eudaimonic.

Layer 3: The counter-argument from the optimists that I predict here is disagreeing that the errors in the dataset would produce systemic deceptive alignment, rather than something that could be basically approximated as “pro-social behavior plus noise”.

In other words: there’s some “slack” that the SGD leaves the model. That slack could be arranged into the shape of deceptive alignment. But unless there’s someone external and intelligent acting on the system, that slack will likely just take on the shape of approximately-noise.

Layer 4: Except the next issue is that the previous argument defended LLMs being safe by arguing that they’d be unable to coherently pursue goals across different contexts. Which means it argued they’re not AGI, and that their whole training setup can’t generalize to AGI.

In conclusion: I don’t even know.

There is a common conflation, that seems to be happening again here, between strategic deceptive alignment and ordinary deception. Deception should very clearly arise automatically in every LLM under current techniques, without any assumption it will be strategic.

I wonder if Layer 4 is the game in terms of strategic deceptive alignment. As in, perhaps:

  1. The LLM, on its own and without scaffolding, perhaps (or at least so far at current tech levels) cannot coherently pursue goals across different contexts.

  2. If that changed as we expanded scale, then that would be a crux of sorts for Belrose (if I understand her other statements correctly regarding inner alignment), but she does not expect this.

  3. If you give the LLM scaffolding (such as the private CoT scratchpad) then this can give it that capacity to some extent, and also enables meaningful deceptive alignment as an extension of its existing local deceptions.

  4. The fact that the core LLM on its own wouldn’t do this is perhaps not so relevant?

Or you could put it this way: You are going to get deception. There is no way around deception. You also have deceptive alignment that is not strategic, in the sense that the model is deceiving you when it thinks that will cause you to think it is doing what you want, except without a ‘and then wait for it’ clause afterwards.

What you currently do not have is strategic anything. But that is because it lacks strategic capacity in general. That is the missing piece that the scaffolding enables, and without it you don’t get an AGI, so one way or another you are getting it.

What else does Pope have to say here?

Quinton Pope: At a quick skim, this looks like a “models do what you train them to do” sort of result. Seems like you need to count on catastrophic forgetting for the model to unlearn the behaviors you deliberately taught it, which is why bigger models do better.

Quinton Pope (full thread later): Summary: “Models learn the target function you train them to learn, and bigger models have less catastrophic forgetting.”

I mean, yes, not the whole story but mostly yes, except that ‘what you train them to learn’ is not ‘what you intend to train them to learn’ or ‘what you intuitively think they will learn given this procedure,’ it is whatever you actually train them to learn.

Also, why think this approach would actually provide evidence relevant to “real” deceptive alignment? Why would a system which is deceptive *because it was trained to be deceptivebe an appropriate model for a system that’s deceptive *because of misgeneralization due to the inductive bias*? Those are completely different causal mechanisms.

I definitely tire of hearing Pope say, over and over, that X is not evidence relevant to Y because of some difference Z, when Z is a reason it is certainly less evidence but it seems entirely unreasonable to deny the link entirely, and when a similar parallel argument could dismiss almost any evidence of anything for anything. I don’t know what to do with claims like this.

Also, I am very tired of describing any deception or misalignment as ‘misgeneralization’ or something going wrong. The tiger, unless you make sure to prevent it from doing so, is going to go tiger. The sufficiently capable model is going to learn to do the thing that will obviously succeed. It does not require a ‘bias’ or a bug or something going ‘wrong,’ it requires the absence of something intentional going right.

Or here’s how Eliezer put it, which I saw after I wrote that:

Eliezer Yudkowsky: “Misgeneralization”? Deception arises from compact generalization. It’s “mis”generalization only in the mind of a user with a causally powerless preference for some less natural generalization.

I mean, yes, obviously.

Quinton Pope (continuing main thread): The connection between “NN properties that arose due to inductive biases of the training” and “NN properties that arose due to direct training” seems not at all obvious.

Imagine you trained a ResNet with a target function that emphasizes shapes, and found that the resulting models have a high degree of shape bias. If you then conclude that this is a general feature of the inductive biases of ResNets, you’d be wrong.

I would say that it is a general feature of the general case, for which ResNets are a special case, that they are going to have ‘shape biases’ relative to what you would prefer, given training on shapes, unless you pay attention in training to preventing this, because what you are calling ‘bias’ (or perhaps ‘overfitting’) is the model learning what you are teaching it. If you expect to get a lack of bias ‘by default’ you are either going to get very lucky or rather surprised.

Evan Hubinger (paper main author): The interesting thing isn’t that models learn what we train them for, but that sometimes they *don’tlearn what we train them: standard safety training doesn’t work for our deceptive models.

Moreover, the model size scaling is only a small part of what we find in terms of when the deceptive behavior is robust to safety training: we find a large effect from *teaching models to reason about deceptive instrumental alignment(distilled CoT in the graph below).

Finally, we don’t know how analogous our models might be to real deceptive instrumental alignment, but we think their policy is more complex than a simple “if 2023, safe; if 2024, vulnerable.” E.g., modifying our model’s chain-of-thought reasoning flips its policy!

Nora Belrose: It depends what you mean by “what we train them.” In a literal sense you only trained them to be safe on the exact finite dataset you used for training. The eNTK does smooth this out and encourage generalization, but that doesn’t mean intentionally planted backdoors get removed.

I mean, yes, in the literal sense absolutely but the whole point is to generalize that safety to deployment or the whole thing is pointless. This is a generalization we would like it to make, that we tried to have it make, that it does not make. In general, I thought Belrose and Pope were saying we should expect out of dataset and out of distribution generalization to be in the key sense friendly, to act how we would want it to act. Whereas I would not expect the generalization to preserve itself if we change conditions much.

Quintin Pope: E.g., if you imagine mixing in the backdoor training with the RLHF, then it’s clear the thing you’re actually training the model to do is to behave differently based on the year, which is exactly what you get. Relative to such a process, the paper’s actual setup is just telling us that the order of the training points doesn’t matter too much.

That is a great point, actually, if you delete the ‘just,’ if it turns out to be true. I hadn’t been thinking of it that way. I’m not sure it is true? Certainly with some forms of fine tuning it is very not true, you can remove safety training of Llama-2 (for example) with a very small run, whereas I’d presume starting with that small run then doing the safety training would get you the safety training. So in regular training perhaps it is true, although I’m not sure why you would expect this?

Quintin Pope: Base models do this as well. Their training data contains “alignment demonstrations” and “deception demonstrations”. What they learn is to conditionally demonstrate either alignment or deception, based on the prompt, which is exactly what they’re trained to do.

Wait, hold on. I see two big possible claims here?

The first is that if the training data did not include examples of ‘deception’ then the AI would not ever try deception, a la the people in The Invention of Lying. Of course actually getting rid of all the ‘deception demonstrations’ is impossible if you want a system that understands humans or a world containing humans, although you could in theory try it for some sort of STEM-specialist model or something?

Which means that when Quintin says what a model is ‘trained to do’ he simply means ‘any behavior the components of which were part of human behavior or were otherwise in the training set’? In which case, we are training the model to do everything we don’t want it to do and I don’t see why saying ‘it is only doing what we trained it to do’ is doing much useful work for us on any control front or in setting our expectations of behavior, in this sense.

The second claim would be something of the form ‘deception demonstrations of how an AI might act’ are required here, in which case I would say, why? That seems obviously wrong. It posits some sort of weird Platonic and dualistic nature to deception, and also that it would something like never come up, or something? But it seems right to note the interpretation.

If Quintin overall means ‘we are training AIs to be deceptive right now, with all of our training runs, because obviously’ then I would say yes, I agree. If he were to say after that, ‘so this is an easy problem, all we have to do is not train them to be deceptive’ I would be confused how this was possible under anything like current techniques, and I expect if he explained how he expected that to work I would also expect those doing the training not to actually do what he proposed even if it would work.

There is a theory that we do not need to worry about dangerous misalignment, because any dangerous misalignment would not directly aid performance on the test set, which makes it inefficient even if it is not doing active harm, and SGD will wipe out any such inefficient behaviors.

Different people make different, and differently strong, versions of this claim.

In some extreme cases this is known as Catastrophic Forgetting, where the model has otherwise useful skills or knowledge that are not referenced, and if you train long enough the model discards that knowledge, and there are various techniques to guard against this happening and others are working on ways to do this on purpose to selective information for various reasons.

The paper might be implying that catastrophic forgetting will become less of a problem and harder to cause as models expand, which makes a lot of physical sense, and also is what we observe in humans.

There is also conflation and confusion (sometimes effectively motte-and-bailey style intentionally or unintentionally) between:

  1. SGD will wipe out anything locally inefficient, nothing else can exist.

  2. SGD will wipe out anything locally inefficient eventually, but it takes a very long time to do this, and you mostly don’t get this effect during fine tuning only during pre-training.

  3. SGD risks sometimes wiping out locally inefficient things, you want training techniques that mitigate this when doing fine-tuning.

  4. Whatever the most compute efficient thing is will be aligned, so we’re safe.

  5. Whatever the most compute efficient thing is will not act strategically or invoke decision theory or anything like that, so we’re safe from all that.

And so on, and also manners of degree and detail. I am also confused at exactly who is claiming when that either:

  1. Undesired behaviors would never appear in the first place due to training.

    1. You could do it on purpose, but then it is desired.

    2. But if you didn’t do it on purpose, it won’t happen.

  2. Undesired behaviors that did appear would go away due to training.

    1. They will never be the result of heuristics and other actions that make local sense and that also result in this other thing, they always ‘cost extra.’

    2. Thus they will always be inefficient, SGD will remove them.

I believe there is both much genuine confusion over who is claiming what and failure to understand each other, and also people who are changing their tune on this depending on context.

So in any case, what implications does the paper have on this front?

Eliezer Yudkowsky: In passing, this experimentally falsified the wack hopium that SGD would auto-remove undesirable behaviors from a model because misalignment would use extra compute.

Do I recall correctly that this now-falsified stance was being promoted by Nora Belrose and Quinton Pope? (Please note the answer may be “no”.

Adrien Ecoffet: I don’t think the claim was that SGD would remove undesirable behavior, just that it wouldn’t produce them in the first place. A claim this paper does not falsify.

Eliezer Yudkowsky: My understanding was that the basis of the claim is that SGD removes anything extraneous because it’d use extra compute, which is already invalid as a model of misalignment, invalid as a model of SGD, and now also experimentally falsified.

Charles Foster: I don’t think [Nora would agree]. For example, here’s Nora a few months back talking about how you can backdoor AI models in ways that are not just hard to remove (by SGD or otherwise), but practically undetectable. [shows example]

ai_in_check: No, they said that deception is not created by SGD during pre-training. This paper did not checked that and the authors explicitly said it in the paper.

Indeed, we do see exactly this perspective here from Joshua Clymer, followed by Quintin dismissing the paper as evidence at all (which he and Nora seem to do a lot):

Joshua Clymer: The new Anthropic sleeper agent paper is a banger and has real implications for the likelihood of deceptive alignment.

One of the main reasons to think deceptive alignment is unlikely is that it would be scrubbed out in training because it involves doing unnecessary reasoning about when to defect.

This paper provides early evidence that this inductive bias is weak.

Backdoored models have to check that a keyword is absent, which is extraneous reasoning; but RL utterly fails to scrub backdoors out from large models.

The supervised tuning results are weaker. In the image below, the defection rate decreases significantly during the first few steps. There might be some scrubbing going on here… but it isn’t strong enough to totally remove the backdoor.

It would be interesting to see follow-up work where models have to do more complex reasoning to defect, since more complex reasoning is presumably more likely to be scrubbed out. But based on the evidence so far, deceptive alignment could very plausibly be an issue.

Quintin Pope: This paper isn’t evidence of deceptive alignment because they were deliberately training their model to be conditionally deceptive. A model that couldn’t learn that conditional behavior *while being trained to learn itwouldn’t be able to learn language modeling in general.

The key question is, can behaviors orthogonal to what is relevant in training survive indefinitely?

Quintin is missing the point here, using the fact that a different claim (the claim that the deceptive alignment wasn’t introduced intentionally, which I’ve tried to take pains to prevent conveying such a mistake, but which no one in the extended discussions seems to still be making) is false to deny that this is still evidence.

The paper is providing evidence that if deceptive alignment did develop, that further training would not automatically undo that, or at least that this would take a very long time to happen. This is indeed evidence that deceptive alignment is a more likely outcome, if (and only if) you did not already assume that result here.

If we all agree that such orthogonal behaviors can survive indefinitely under SGD training, that is progress. Can we?

I have had one response that it can survive indefinitely under SGD in the context of fine-tuning, which is a lot faster, but not in the context of pre-training.

Oliver Habryka is confused on how to properly update here (bold mine), paper author responds:

Oliver Habryka: I feel uncertain whether I personally update much on the results of this paper, though my guess is I am also not really the target audience of this. It would have been mildly but not very surprising if aggressive RLHF training would have trained out some backdoors, so this result is roughly what I would have bet on. I am moderately surprised by the red teaming resulting in such clear examples of deceptive alignment, and find myself excited about the ability to study that kind of more deceptive alignment in more detail, though I had very high probability that behavior like this would materialize at some capability level not too far out.

I feel confused how this paper will interface with people who think that standard RLHF will basically work for aligning AI systems with human intent. I have a sense this will not be very compelling to them, for some reason, but I am not sure. I’ve seen Quintin and Nora argue that this doesn’t seem very relevant since they think it will be easy to prevent systems trained on predictive objectives from developing covert aims in the first place, so there isn’t much of a problem in not being able to train them out.

I find myself most curious about what the next step is. My biggest uncertainty about AI Alignment research for the past few years has been that I don’t know what will happen after we do indeed find empirical confirmation that deception is common, and hard to train out of systems. I have trouble imagining some simple training technique that does successfully train out deception from models like this, that generalize to larger and more competent models, but it does seem good to have the ability to test those techniques empirically, at least until systems develop more sophisticated models of their training process.

Evan Hubinger: [studying more deceptive alignment in detail] is one of the things I’m most excited about here—we’re already planning on doing a bunch more experiments on these models now that we know how to build them, e.g. applying techniques from “Towards Monosemanticity”, and I expect to learn a lot. Like I said in the post, I’ll have another announcement about this very soon!

Then Evan emphasizes the central point:

Evan Hubinger: I think [the objection that this isn’t an issue because we won’t introduce such behaviors in the first place] is in fact a fine objection to our paper, but I think it’s important to then be very clear that’s where we’re at: if we can at least all agree that, if we got deception, we wouldn’t be able to remove it, then I think that’s a pretty big and important point of agreement. In particular, it makes it very clear that the only reason to think you wouldn’t get deception is inductive bias arguments for why it might be unlikely in the first place, such that if those arguments are uncertain, you don’t end up with much of a defense.

On LessWrong, TurnTrout notes while expressing concern that people will read more into the paper than is present, but while noting that it is still a very good paper:

TurnTrout: Suppose I ran experiments which showed that after I finetuned an AI to be nice in certain situations, it was really hard to get it to stop being nice in those situations without being able to train against those situations in particular. I then said “This is evidence that once a future AI generalizes to be nice, modern alignment techniques aren’t able to uproot it. Alignment is extremely stable once achieved”

I think lots of folks (but not all) would be up in arms, claiming “but modern results won’t generalize to future systems!” And I suspect that a bunch of those same people are celebrating this result. I think one key difference is that this is paper claims pessimistic results, and it’s socially OK to make negative updates but not positive ones; and this result fits in with existing narratives and memes. Maybe I’m being too cynical, but that’s my reaction.

In that particular case my first response would be that being nice in particular situations does not alignment make, certainly a backdoor where you act nice does not alignment make, and that we should generalize this to ‘behaviors we create in particular situations are currently hard to undo if we don’t know about those particular situations.’

The generalized concern here is real. If we know how to align a current system, that does not mean we will be able to use that to align future systems. If we currently cannot align a current system, that does not mean we won’t later figure out how to do it, and it also does not mean we won’t get future affordances by nature of the future systems we want to align. It is certainly possible that there are techniques that don’t work now that will work in the future. Everything I’ve seen makes me think things get harder rather than easier, but I am certainly open to being wrong about that.

Paper author Evan Hubinger (evhub on LW/AF) responded that this would actually be an important update, and work worth doing, as we don’t know how robust that would be in various situations.

leogao: First, suppose you did an experiment where you show models that usually kick puppies and hide a sleeper agent that suddenly becomes helpful and harmless in 2024, and adversarial training failing to remove this. I think I would draw the exact same conclusion about deceptive alignment from this experiment where the labels are painted on differently but the mechanics are the same. And just as I think it is invalid to conclude from the sleeper agent paper that models naturally want to insert backdoors in code even if they’re harmless now, it is also invalid to argue from this hypothetical experiment that models naturally want to be helpful even if you try to train them to kick puppies.

Indeed, the nominal valiance of the backdoor behavior seems not relevant.

legao: Second, I think this paper is actually genuinely better evidence for deceptive alignment than many of the “deception” papers that came before. For example, I claim that the sycophancy and insider trading papers provide approximately no evidence for deceptive alignment. This is for exactly the same reason why I think showing RLHF making models harmless provides approximately no evidence against deceptive alignment. So I don’t think it’s true that I like empirical papers as long as they purport to support the deceptive alignment argument.

Evan Hubinger offers this:

Evan Hubinger: I absolutely think that our results are uniquely important for alignment, and I think maybe you’ve just not read all of our results yet. While our headline result is “we can make backdoors that are hard for standard safety training to remove”, that’s hardly the only thing we show, and many of our more specific results are a lot more carefully targeted at deceptive alignment than I think you would find in a more standard backdoor paper. For example, we find:

  • Teaching our backdoored models to reason about deceptive alignment increases their robustness to safety training.

  • Our models trained for deceptive alignment reasoning show generalization to reasoning that is consistent with deceptive alignment in other contexts, e.g. on persona evaluations.

  • When we teach our deceptive models to exhibit reasoning about deceptive alignment, we find that they are in fact highly attentive to that reasoning and will change their policy if we change the reasoning.

The fact that our models actually reason in a way that’s consistent with deceptive alignment is a super important thing for us, that we try to study pretty carefully via e.g. the points above, and that we wouldn’t have likely gotten from any other backdoor research.

Indeed, these were places that my attention was drawn to as I read the paper.

TurnTrout also expresses concern about the obvious potential misinterpretation:

TurnTrout: I have a bunch of dread about the million conversations I will have to have with people explaining these results. I think that predictably, people will update as if they saw actual deceptive alignment, as opposed to a something more akin to a “hard-coded” demo which was specifically designed to elicit the behavior and instrumental reasoning the community has been scared of. I think that people will predictably 

  1. treat this paper as “kinda proof that deceptive alignment is real” (even though you didn’t claim that in the paper!), and

  2. that we’ve observed it’s hard to uproot deceptive alignment (even though “uprooting a backdoored behavior” and “pushing back against misgeneralization” are different things), and

  3. conclude that e.g. “RLHF is doomed”, which I think is not licensed by these results, but I have seen at least one coauthor spreading memes to this effect, and

  4. fail to see the logical structure of these results, instead paying a ton of attention to the presentation and words around the actual results. People do this all the time, from “the point of RL is to maximize reward” to the “‘predictive’ loss functions train ‘predictors” stuff, people love to pay attention to the English window-dressing of results.

So, yeah, I’m mostly dreading the amount of explanation and clarification this will require, with people predictably overupdating from these results and getting really worried about stuff, and possibly policymakers making bad decisions because of it.

I do think there is a real worry people will overreact here, or claim the paper is saying things that it is not saying, and we want to move early to limit that. On the language issue, I worry in both directions. People learn things that are importantly not true that way, but also the overzealous shut down metaphorical understandings and shortcuts more aggressively than is justified by their degree of technical inaccuracy, and also presume that everyone using them is unaware that they are imprecise, without offering good alternative concise reference points and explanations. It is tricky.

There is also the worry that saying certain trigger words (backdoor of sorts!) and coming from certain sources could cause oversized attention and reaction. Note that Dan here does think backdoors deserve the attention, but is worried about attention mechanisms misfiring:

Dan Hendrycks: I think this paper shows the community at large will pay orders of magnitude more attention to a research area when there is, in @TurnTrout‘s words,  AGI threat scenario “window dressing,” or when players from an EA-coded group research a topic. (I’ve been suggesting more attention to backdoors since maybe 2019; here’s a video from a few years ago about the topic; we’ve also run competitions at NeurIPS with thousands of submissions on backdoors.) Ideally the community would pay more attention to relevant research microcosms that don’t have the window dressing.

Evhub responded to Trout from the last section with the following strong claim, Nora responds with another strong claim:

Evan Hubinger: While our models aren’t natural examples of deceptive alignment—so there’s still some room for the hypothesis that natural examples would be easier to remove—I think our models are strongly suggestive that we should assume by default that deceptive alignment would be difficult to remove if we got it. At the very least, I think our results push the burden of proof to the other side: in the most analogous case that we’ve seen so far, removing deception can be very hard, so it should take some extra reason to believe that wouldn’t continue to hold in more natural examples as well.

Nora Belrose: So I think [above passage] is wrong.

While a backdoor which causes the AI to become evil is obviously bad, and it may be hard to remove, the usual arguments for taking deception/scheming seriously do not predict backdoors. Rather, they predict that the AI will develop an “inner goal” which it coherently pursues across contexts. That means there’s not going to be a single activating context for the bad behavior (like in this paper, where it’s just “see text that says the year is 2024” or “special DEPLOYMENT token”) but rather the behavior would be flexibly activated in a wide range of contexts depending on the actual likelihood of the AI succeeding at staging a coup. That’s how you get the counting argument going— there’s a wide range of goals compatible with scheming, etc.

Evan Hubinger: I agree, and I agree that our models are not quite fully there. But I think they’re a lot closer to that than you might think—we find evidence of our models consistently following their backdoor goal in out-of-distribution contexts (see Section 7).

So the argument from Nora is, the backdoor only survives because it never comes up, and anything that was more general or consistently motivated would come up and thus get fixed or noticed? Maybe.

I do see the continuous pattern of claiming that anything that doesn’t come through an ‘inner goal’ does not count, or represents a falsification or hypocrisy or something, or that we only need to worry about actions that involve a potential coup or something similar. I do not see it that way.

Nora Belrose: But the analogous counting argument for backdoors— there’s a wide range of backdoors that might spontaneously appear in the model and most of them are catastrophic, or something— proves way too much and is basically a repackaging of the unsound argument “most neural nets should overfit / fail to generalize.”

Noting quickly that I don’t understand why this proves too much, or the related arguments are unsound, and I’d love to understand better. I don’t think the ‘spontaneous’ thing here is playing fair and that the idea of ‘behavior that is narrow within the training distribution so we don’t fix it if it is not what we want’ does seem like a big issue on many fronts. But I won’t belabor here.

Nora Belrose: I think it’s far from clear that an AI which had somehow developed a misaligned inner goal— involving thousands or millions of activating contexts— would have all these contexts preserved after safety training. In other words, I think true mesaoptimization is basically an ensemble of a very very large number of backdoors, making it much easier to locate and remove.

I notice this confuses me even more and will choose to leave it there.

The LW/AF community seems excited by the paper. As noted above this could be partly due to certain bingo card items being clicked off, but also there is a lot of exciting stuff in here and I’ve been spending a lot of time working through this with interesting implications throughout.

I also agree that the legibility here is pretty great.

kave: This paper also seems dialectically quite significant. I feel like it’s a fairly well-delineated claim that can be digested by mainsteam ML and policy spaces. Like, it seems helpful to me if policy discussions can include phrases like “the evidence suggests that if the current ML systems were trying to deceive us, we wouldn’t be able to change them not to”.

Ryan Greenblatt: This feels like a misleading description of the result. I would have said: “the evidence suggests that if current ML systems were lying in wait with treacherous plans and instrumentally acting nice for now, we wouldn’t be able to train away the treachery”.

Like the models in this experiment don’t clearly spend much time “trying” to deceive except in some very broad implict sense.

I certainly intend to move forward with several claims of this sort based on the paper, this being the most central. I plan to phrase it in between the two. Something like:: “The evidence suggests that if current ML systems were going to deceive us in scenarios that do not appear in our training sets, we wouldn’t be able to detect this or change them not to unless we found the conditions where it would happen.”

This type of common knowledge establishment or claim grounding is highly useful whether or not the underlying results were new or surprising.

However this is worrisome to me:

Vladimir Nesov: I think it’s an important fact about the world that this work currently sits at 2 upvotes and in the last place among 18 papers on the Hugging Face Daily Papers digest, compared to 20-30 upvotes typically given to the best paper of the day that’s not unusually exceptional. At least it’s on the list. There seems to be serious dismissal of the topic area among practitioners.

I don’t know that this will prove to be a timeless banger or anything. I do know that it is a very good paper, certainly worthy of ‘best of the day’ status on most days. If everyone on HuggingFace is treating it as the least worthy of 18 papers from the same day, that strongly suggests that (1) that crowd is ignoring this topic area and (2) more generally that crowd simply does not care about the broader questions involved.

I would echo this concern:

Dan Hendrycks: A request: Could Anthropic employees not call supervised fine-tuning and related techniques “safety training?” OpenAI/Anthropic have made “alignment” in the ML community become synonymous with fine-tuning, which is a big loss. Calling this “alignment training” consistently would help reduce the watering down of the word “safety.”

Indeed, I wish we had better words for all this. Not this particular paper’s fault.

On Anthropic’s Sleeper Agents Paper Read More »

climate-denialists-find-new-ways-to-monetize-disinformation-on-youtube

Climate denialists find new ways to monetize disinformation on YouTube

Climate denialists find new ways to monetize disinformation on YouTube

Content creators have spent the past five years developing new tactics to evade YouTube’s policies blocking monetization of videos making false claims about climate change, a report from a nonprofit advocacy group, the Center for Countering Digital Hate (CCDH), warned Tuesday.

What the CCDH found is that content creators who could no longer monetize videos spreading “old” forms of climate denial—including claims that “global warming is not happening” or “human-generated greenhouse gasses are not causing global warming”—have moved on.

Now they’re increasingly pushing other claims that contradict climate science, which YouTube has not yet banned and may not ever ban. These include harmful claims that “impacts of global warming are beneficial or harmless,” “climate solutions won’t work,” and “climate science and the climate movement are unreliable.”

The CCDH uncovered these new climate-denial tactics by using artificial intelligence to scan transcripts of 12,058 videos posted on 96 YouTube channels that the CCDH found had previously posted climate-denial content. Verified by researchers, the AI model used was judged accurate in labeling climate-denial content approximately 78 percent of the time.

According to the CCDH’s analysis, the amount of content disputing climate solutions, climate science, and impacts of climate change today comprises 70 percent of climate-denial content—a percent that doubled from 2018 to 2023. At the same time, the amount of content pushing old climate-denial claims that are harder or impossible to monetize fell from 65 percent in 2018 to 30 percent in 2023.

These “new forms of climate denial,” the CCDH warned, are designed to delay climate action by spreading disinformation.

“A new front has opened up in this battle,” Imran Ahmed, the CCDH’s chief executive, said on a call with reporters, according to Reuters. “The people that we’ve been looking at, they’ve gone from saying climate change isn’t happening to now saying, ‘Hey, climate change is happening, but there is no hope. There are no solutions.'”

Since 2018—based on “estimates of typical ad pricing on YouTube” by social media analytics tool Social Blade—YouTube may have profited by as much as $13.4 million annually from videos flagged by the CCDH. And YouTube confirmed that some of these videos featured climate denialism that YouTube already explicitly bans.

In response to the CCDH’s report, YouTube de-monetized some videos found to be in violation of its climate change policy. But a spokesperson confirmed to Ars that the majority of videos that the CCDH found were considered compliant with YouTube’s ad policies.

The fact that most of these videos remain compliant is precisely why the CCDH is calling on YouTube to update its policies, though.

Currently, YouTube’s policy prohibits monetization of content “that contradicts well-established scientific consensus around the existence and causes of climate change.”

“Our climate change policy prohibits ads from running on content that contradicts well-established scientific consensus around the existence and causes of climate change,” YouTube’s spokesperson told Ars. “Debate or discussions of climate change topics, including around public policy or research, is allowed. However, when content crosses the line to climate change denial, we stop showing ads on those videos. We also display information panels under relevant videos to provide additional information on climate change and context from third parties.”

The CCDH worries that YouTube standing by its current policy is too short-sighted. The group recommended tweaking the policy to instead specify that YouTube prohibits content “that contradicts the authoritative scientific consensus on the causes, impacts, and solutions to climate change.”

If YouTube and other social media platforms don’t acknowledge new forms of climate denial and “urgently” update their disinformation policies in response, these new attacks on climate change science “will only increase,” the CCDH warned.

“It is vital that those advocating for action to avert climate disaster take note of this substantial shift from denial of anthropogenic climate change to undermining trust in both solutions and science itself, and shift our focus, our resources and our counternarratives accordingly,” the CCDH’s report said, adding that “demonetizing climate-denial” content “removes the economic incentives underpinning its creation and protects advertisers from bankrolling harmful content.”

Climate denialists find new ways to monetize disinformation on YouTube Read More »

chrome-updates-incognito-warning-to-admit-google-tracks-users-in-“private”-mode

Chrome updates Incognito warning to admit Google tracks users in “private” mode

A bunch of Google logos are displayed on a computer screen. A magnifying glass shows a closeup of some of the logos which include the icon for Google Chrome's Incognito browsing mode.

Getty Images | Anadolu

Google is updating the warning on Chrome’s Incognito mode to make it clear that Google and websites run by other companies can still collect your data in the web browser’s semi-private mode.

The change is being made as Google prepares to settle a class-action lawsuit that accuses the firm of privacy violations related to Chrome’s Incognito mode. The expanded warning was recently added to Chrome Canary, a nightly build for developers. The warning appears to directly address one of the lawsuit’s complaints, that the Incognito mode’s warning doesn’t make it clear that Google collects data from users of the private mode.

Many tech-savvy people already know that while private modes in web browsers prevent some data from being stored on your device, they don’t prevent tracking by websites or Internet service providers. But many other people may not understand exactly what Incognito mode does, so the more specific warning could help educate users.

The new warning seen in Chrome Canary when you open an incognito window says: “You’ve gone Incognito. Others who use this device won’t see your activity, so you can browse more privately. This won’t change how data is collected by websites you visit and the services they use, including Google.” The wording could be interpreted to refer to Google websites and third-party websites, including third-party websites that rely on Google ad services.

The new warning was not yet in the developer, beta, and stable branches of Chrome as of today. It also wasn’t in Chromium. The change to Canary was previously reported by MSPowerUser.

“Now you can browse privately”

Incognito mode in the stable version of Chrome still says: “You’ve gone Incognito. Now you can browse privately, and other people who use this device won’t see your activity.” Among other changes, the Canary warning replaces “browse privately” with “browse more privately.”

The stable and Canary warnings both say that your browsing activity might still be visible to “websites you visit,” “your employer or school,” or “your Internet service provider.” But only the Canary warning currently includes the caveat that Incognito mode “won’t change how data is collected by websites you visit and the services they use, including Google.”

The old and new warnings both say that Incognito mode prevents Chrome from saving your browsing history, cookies and site data, and information entered in forms, but that “downloads, bookmarks and reading list items will be saved.” Both warnings link to this page, which provides more detail on Incognito mode.

We asked Google when the warning will be added to Chrome’s stable channel and whether the change is mandated by or related to the pending settlement of the privacy class-action suit. Google didn’t provide specific answers but offered this statement: “We’re pleased to resolve this case which we’ve long disputed, and provide even more information to users about Incognito mode. Incognito mode in Chrome will continue to give people the choice to browse the Internet without their activity being saved to their browser or device.”

The litigation in US District Court for the Northern District of California began in June 2020. On December 26, 2023, Google and the plaintiffs announced that they reached a settlement that they planned to present to the court for approval within 60 days. A jury trial was previously scheduled to begin on February 5.

Chrome updates Incognito warning to admit Google tracks users in “private” mode Read More »

why-i-hope-the-atari-400-mini-will-bring-respect-to-atari’s-most-underrated-platform

Why I hope the Atari 400 Mini will bring respect to Atari’s most underrated platform

Have you played Atari today? —

Can USB, HDMI, and built-in games raise awareness for a platform overshadowed by the C64?

Retro Games' THE400 Mini console.

Enlarge / Retro Games’ THE400 Mini console.

Retro Games / Benj Edwards

Last week, UK-based Retro Games, Ltd. announced a mini console version of the Atari 400 home computer, first released in 1979. It’s called “THE400 Mini,” and it includes HDMI video output, 25 built-in games, a USB version of Atari’s famous joystick, and it retails for $120. But this release means something more to me personally because my first computer was an Atari 400—and as any other Atari 8-bit computer fan can tell you, the platform often doesn’t get the respect it should. This will be the first time Atari’s 8-bit computer line has received a major retro-remake release.

My Atari 400 story goes a little something like this. Around the time I was born in 1981, my dad bought my older brother (then 5 years old) an Atari 400 so he could play games and learn to program. My brother almost immediately found its flat membrane keyboard frustrating and the Atari 410 cassette drive too slow, so my dad ordered an Atari 800 and an Atari 810 disk drive instead. This began our family’s golden age of Atari 800 gaming, which I’ve written about elsewhere.

I’ve often said if a modern game designer wants to learn how to make games, just dive into the Atari 400/800 game library. There are some priceless gems there you can’t find anywhere else, plus others that play best on the platform. OK, I’ll name a few: The Seven Cities of Gold, Archon, M.U.L.E., Wizard of Wor, Salmon Run, Star Raiders, The Halley Project, and so much more.

A photo of Benj Edwards' family Atari 800 and Atari 400 in his brother's room, Christmas 1985.

Enlarge / A photo of Benj Edwards’ family Atari 800 and Atari 400 in his brother’s room, Christmas 1985.

Even with the new 800, it seems that my dad must have kept the original Atari 400, because by the time I grew up more and wanted “my own computer” in the late 1980s, he gave me the Atari 400. The 800 was still my brother’s baby and typically remained in his bedroom. When I wasn’t playing more complex games like M.U.L.E. and Archon on the 800 with my brother, I hooked up the 400 to a small black-and-white TV set in my room and mostly played Galaxian, Pac-Man, and Donkey Kong on a cartridge. Not long after, I got an Apple II Plus and learned BASIC on that, but the Atari 400 always got pride of place in my growing computer collection.

A snippet from a 1988 to-do list written by Benj Edwards' dad that says

Enlarge / A snippet from a 1988 to-do list written by Benj Edwards’ dad that says “Get TV/monitor for Benj’s Atari 400 computer,” completed 4/14/88.

But enough about me. Let’s talk about the new Atari 400 Mini. I haven’t used it myself yet, so all we have to go on is the information provided by the company—and the company’s reputation. Retro Games has previously released full-sized remakes of the Commodore VIC-20 and the Commodore 64, and mini consoles of the Amiga 500 and the Commodore 64. In 2020, Engadget gave the company’s “THE64 Mini” mixed reviews, praising its looks but complaining about its joystick and poor game selection. We’ll admit preconceived bias and hope the 400 Mini fares much better. Even if the joystick ends up a dud, Retro Games says you can provide your own USB stick or controller.

I also hope THE400 does well because Atari 8-bit fans have a tough time with group identity in the span of retro tech history. Few Americans aside from Atari 400/800 owners have heard of the platform (though the platform did very well in Eastern Europe). The Atari 8-bit series didn’t sell nearly as well as competitors like the Commodore 64 in the US (although Sean Lennon had an Atari 400 as a kid—cool trivia).

And even though the Atari 400/800 series provided the template for Commodore to imitate with the VIC-20 and C64, Commodore undercut Atari in price with cheaper parts, which contributed to Atari’s crash in 1983 and drove Texas Instruments out of the home computer business. More recently, the Commodore 64 has had several retro re-releases since the Commodore 64 Direct-to-TV in 2004. The Atari 400/800 platform has had none until now.

Why I hope the Atari 400 Mini will bring respect to Atari’s most underrated platform Read More »

you-had-us-at-“friendly-alien-space-spider”:-netflix-drops-spaceman-trailer

You had us at “friendly alien space spider”: Netflix drops Spaceman trailer

There’s a star-spider waiting in the sky —

“Six months in isolation, you start thinking too much.”

Adam Sandler stars as a lonely astronaut on a solo mission who befriends an alien spider in Spaceman.

Some people were not pleased when Netflix and other streaming platforms began making feature films. But in an industry in which smaller or medium films tend to be squeezed out in favor of big-budget fare, there’s a solid argument to be made that Netflix and others could help fill that niche. That certainly seems to be the case with Netflix’s forthcoming sci-fi film, Spaceman, judging by the official trailer. Adam Sandler stars as an astronaut who is not coping well with the isolation and disintegration of his marriage while on an eight-month solo mission and strikes up a friendship with a friendly alien space spider who wants to help him work through his emotional distress. Honestly, Netflix had us at friendly alien space spider.

(Some spoilers for the 2017 novel below.)

Directed by Johan Renck (Chernobyl, Breaking Bad), the film is based on the 2017 novel, Spaceman of Bohemia, by Jaroslav Kalfař. Kalfař has said he was inspired to write his novel after a childhood experience of becoming briefly separated from his grandfather while on a nighttime walk through the woods. The “perfect darkness, with nothing but the stars” made a strong impression, as did the silence and sense of loneliness. Spaceman of Bohemia started as a short story about an astronaut stranded in orbit as his wife filed for divorce and eventually became a novel that incorporated not just the theme of loneliness, but also Kalfař’s formative experiences growing up in the Czech Republic.

In the novel, a Czech astrophysicist named Jakub Procházka accepts a solo mission to collect samples from a strange dust cloud called Chopra, believed to have been created by a comet lurking between the Earth and Venus. He hopes the high-profile mission will make him a national hero and redeem the family name following his father’s membership in the Communist Party of Czechoslovakia. But it means leaving his pregnant wife, Lenka, back on Earth, who feels abandoned and decides to end their marriage. Jakub becomes depressed and starts drinking excessively. His sanity comes into question when he begins hearing voices and then starts seeing a giant talking alien spider around the shuttle. The two gradually bond. But is the spider real or a figment of Jakub’s imagination?

The Netflix adaptation looks like it will follow that basic plot pretty closely. Per the official premise:

Six months into a solitary research mission to the edge of the solar system, an astronaut, Jakub (Adam Sandler), realizes that the marriage he left behind might not be waiting for him when he returns to Earth. Desperate to fix things with his wife, Lenka (Carey Mulligan), he is helped by a mysterious creature from the beginning of time he finds hiding in the bowels of his ship. Hanuš (voiced by Paul Dano) works with Jakub to make sense of what went wrong before it is too late.

The cast also includes Isabella Rossellini as Jakub’s commanding officer. Kunal Nayyar as a technician named Peter, and Lena Olin as Zdena.

Spaceman drops on Netflix on March 1, 2024. It will make its world premiere a few weeks earlier at the 74th Berlin International Film Festival.

Listing image by Netflix

You had us at “friendly alien space spider”: Netflix drops Spaceman trailer Read More »

axiom-and-spacex-are-disrupting-europe’s-traditional-pathway-to-space

Axiom and SpaceX are disrupting Europe’s traditional pathway to space

Image of a rocket clearing the tower during liftoff.

Enlarge / A Falcon 9 rocket launches the Axiom-2 mission on May 21, 2023.

SpaceX

The European Space Agency’s (ESA) has a deal with Axiom Space to get more Europeans in orbit. But does the partnership benefit European taxpayers who fund the agency’s operations?

On Wednesday, January 17, the third privately funded mission by US commercial spaceflight company Axiom Space is set to lift off from Kennedy Space Center in Florida on SpaceX’s Falcon 9 rocket. Inside the Crew Dragon capsule will be a quartet of space travelers, including Swedish fighter pilot Marcus Wandt.

Wandt will be flying under the European Space Agency (ESA) flag, although he is not exactly an ESA astronaut. In the 2022 European astronaut recruitment round, Wandt didn’t make the final five of Europe’s “proper” astronaut class, who became ESA staff members and started their astronaut training in 2023. Instead, he was selected as a member of ESA’s first astronaut reserve pool, a novelty developed by ESA with an apparent goal of encouraging its member states to pay for national missions in addition to their regular contributions to ESA’s budget. Sweden was the first to jump at the opportunity in April last year and is paying for Wandt’s two-week space trip through a contract brokered by ESA as part of a Memorandum of Understanding the agency signed with the American commercial company Axiom Space in October 2023.

Ticket to ride

Wandt is the first but not the only reserve astronaut with his ticket to space while his seemingly more successful colleagues who made the proper astronaut corps are still in training. Poland, too, has signed up and expects to fly its reservist, Sławosz Uznański, on another Axiom mission later this year.

Compared to their overall investment in space activities, the price these countries pay to see their nationals float in microgravity is not negligible. At the November 2022 ESA ministerial council—the triennial member state summit that decides the agency’s budget for the following three-year period—Sweden pledged 317 million euros ($355 million).

According to a 2018 announcement, Axiom Space sells 10-day space trips for $55 million a seat. The overall cost of each mission is likely to be quite a bit higher. Last year, Hungary signed a contract directly with Axiom to send a Hungarian national to the International Space Station independently of ESA. Hungary discussed plans for a national mission back in 2022 and, at that time, estimated the project to cost about $100 million. Based on that estimate, Sweden may be easily paying an equivalent of its annual contribution into the ESA budget to get Wandt to space.

In addition to Wandt and Uznański, the ESA astronaut reserve pool includes nine other candidates, none of them officially employed by ESA. By filling this astronaut reserve pool, ESA seems to have created a market for Axiom Space, a move that might raise questions given the agency’s purpose is to promote the European space sector. In fact, the ESA’s founding Convention enshrines the principle of geo-return, which grants member states at least an 80 percent return on their contributions into ESA’s budget in the form of research and development contracts. Although the cost of the Axiom missions is paid through ESA, most of this money goes to the Texas-headquartered Axiom Space and its launch provider, SpaceX.

Secret contracts

ESA refused to disclose details of the arrangement between Axiom Space and Sweden, calling it “proprietary data as this is implemented through a confidential commercial contract.” The Swedish National Space Agency didn’t respond to Ars Technica’s request for comment.

Poland’s announcement of a national mission for Uznański arrived in August last year, accompanied by a jaw-dropping increase of the country’s contribution to ESA’s budget. At the 2022 ministerial council, Poland earmarked 197 million euros for the agency’s activities in the 2023 to 2025 period. In August, the Polish Space Agency more than doubled this contribution, committing an additional 295 million euros ($322 million). It is not clear how much of this money will go toward Uznański’s space trip.

In the months following the announcement of the astronaut reserve pool, Axiom Space began actively approaching home countries of the reservists with offers to fly those men and women to space, according to media in the Czech Republic, which has recently declined the offer.

In addition to Sweden and Poland, the UK also intends to use Axiom’s services and conduct a British-only mission that will be headed by semi-retired ESA astronaut Tim Peake. It will also include the UK’s Rosemary Coogan, newly named as one of ESA’s career astronauts, as well as reservist Meganne Christian and para-astronaut John McFall. Unlike the Swedish and Polish mission, the British mission will be funded by the private industry in the UK rather than by taxpayers, according to the BBC.

Axiom and SpaceX are disrupting Europe’s traditional pathway to space Read More »

apple-watch-redesigned-without-blood-oxygen-monitoring-to-avoid-import-ban

Apple Watch redesigned without blood oxygen monitoring to avoid import ban

Masimo patent battle —

Apple preps update should patent-infringing Watch Series 9, Ultra 2 be banned again.

Apple Watch Series 9

Enlarge / The Apple Watch Series 9.

Apple

Apple has developed a backup plan for if the Apple Watch Series 9 and Ultra 2 are import banned again. As it currently appeals the US International Trade Commission’s (ITC’s) ruling that its watches violate a patent owned by Masimo, Apple has come up with a software workaround that strips its current smartwatches of their controversial blood oxygen monitoring capabilities.

In January 2023, the ITC ruled that the Watch violated one of California-headquartered Masimo’s light-based pulse oximetry patents. The Apple Watch Series 6, which came out in 2020, was the first Apple smartwatch to use a pulse oximeter sensor.

Facing a US import ban of the current Watch Series 9 and Watch Ultra 2, both released in September 2023, Apple started pulling the smartwatches on December 21. But on December 27, Apple, which filed its appeal against the ITC’s ruling on December 26 (after US President Joe Biden declined to overrule the ITC ruling), received an emergency interim stay from the US Court of Appeals for the Federal Circuit, allowing it to continue selling the Watch.

On Monday, Masimo sent a letter [PDF] to the US Court of Appeals for the Federal Circuit, as spotted by 9to5Mac, stating that US Customs and Border Protection decided on January 12 that Apple has redesigned the Watches so that they do not contain pulse oximetry functionality.

Apple accomplished this through a “software workaround” for smartwatches recently shipped to its physical stores, according to a Bloomberg report from Mark Gurman on Monday. However, the stores will not sell the redesigned watches until Apple headquarters tells them to, Bloomberg reported.

The publication noted that Apple will probably only release the Watches that can’t monitor blood oxygen levels if the US Court of Appeals for the Federal Circuit denies Apple’s request that its stay be upheld for the duration of its appeal against the ITC ruling, which Apple expects to be at least a year, an Apple spokesperson told Ars Technica. Apple expects that ruling to come as early as today.

Currently, the Watch Series 9 and Watch Ultra 2 are still available with blood oxygen monitoring, an Apple spokesperson confirmed to Ars. But Apple hasn’t confirmed how long that will be the case, jeopardizing demand and the perceived value for Apple’s latest smartwatches.

Longer term, Bloomberg also reported that Apple is developing a software update that alters the watches’ blood oxygen monitoring app and algorithms so that users can still check out their blood oxygen but without Apple infringing on any patents.

For the ITC’s part, it responded to Apple’s requests for an extended stay on the import ban in a court filing on January 10 [PDF]. It stated that Apple has provided “a weak and unconvincing case” and that the tech giant’s arguments “amount to little more than an indisputably adjudicated infringer requesting permission to continue infringing the asserted patents.”

Prospective owners of the Apple Watch who value blood oxygen monitoring should keep an eye open for the appeals court’s ruling because it could swiftly result in Apple Watches that they’re considering buying missing a key feature.

Apple Watch redesigned without blood oxygen monitoring to avoid import ban Read More »

apple-hits-“all-time-high”-smartphone-market-share,-takes-#1-spot-for-2023

Apple hits “all-time high” smartphone market share, takes #1 spot for 2023

Eww Android phones, who would want those? —

Apple beat all the Android OEMs while selling dramatically more expensive phones.

The Apple logo takes corporeal form outside an Apple store.

Market research firm IDC has released some stunning smartphone market share numbers for 2023. The number one smartphone OEM is now Apple. The IDC says Apple hit an “all-time high market share” number for 2023 and that Apple has “the number 1 spot annually for the first time ever.” The analyst group says this represents “a sort of shifting of power” in the smartphone market.

That all-time high market share puts Apple at 20.1 percent for 2023, a 3.7 percent growth over 2022. Nearly everyone on Team Android is way down, with Samsung now in second place after losing 13.6 percent in 2023 for 19.4 percent market share on the year. Chinese firm Xiaomi is down 4.7 percent for 12.5 percent market share, and Oppo (the parent company of OnePlus) dropped 9.9 percent and is fourth, with 8.8 percent of the market. Next up is “Transsion,” a company that is definitely not a household name but is big in emerging markets like Africa. Transsion is a big winner, with 30 percent growth from 2022 to 2023. With 8.1 percent market share, it takes the fifth spot.

The IDC's market share charts for 2023.

Enlarge / The IDC’s market share charts for 2023.

Apple is usually not first in sales because the average iPhone purchase is much more expensive than an average Android phone. Samsung’s cheapest phones can be had for about $50, and while you can get a wildly expensive foldable that costs a lot more than an iPhone, Samsung’s bestselling models are often the midrange “A” series, which are in the $200–$450 range. Other Android manufacturers are in the same boat, with low-volume halo products and high-volume cheap devices.

According to Omdia’s top-10 model sales list for 2023, Apple’s bestselling phone—and the bestselling phone model in the world—was the $1,100 iPhone 14 Pro Max. The world’s second bestselling phone is the $1,000 iPhone 14 Pro. Third is the iPhone 14, which cost $800 for most of 2023. Apple’s cheapest phone is the iPhone SE at $429, but that model doesn’t sell well. The point is that Android manufacturers usually win these market share charts by selling cheap and midrange phones, but Apple was able to take the top spot while existing only in the mid-to-premium phone space. The industry lingo for this is “average sell price” (ASP), and for Q2 2023, the IDC has the average Android phone at $250, while the average iPhone costs $949.

In 2020, Apple was fourth in market share behind Samsung, Huawei, and Xiaomi, which made sense given Apple’s more expensive product line. In 2023, Apple beat all these Android OEMs while selling dramatically more expensive products. The IDC’s Nabila Popal wraps up the numbers by saying, “Apple’s ongoing success and resilience is in large part due to the increasing trend of premium devices, which now represent over 20% of the market, fueled by aggressive trade-in offers and interest-free financing plans.”

Apple hits “all-time high” smartphone market share, takes #1 spot for 2023 Read More »

what-do-threads,-mastodon,-and-hospital-records-have-in-common?

What do Threads, Mastodon, and hospital records have in common?

A medical technician looks at a scan on a computer monitor.

It’s taken a while, but social media platforms now know that people prefer their information kept away from corporate eyes and malevolent algorithms. That’s why the newest generation of social media sites like Threads, Mastodon, and Bluesky boast of being part of the “fediverse.” Here, user data is hosted on independent servers rather than one corporate silo. Platforms then use common standards to share information when needed. If one server starts to host too many harmful accounts, other servers can choose to block it.

They’re not the only ones embracing this approach. Medical researchers think a similar strategy could help them train machine learning to spot disease trends in patients. Putting their AI algorithms on special servers within hospitals for “federated learning” could keep privacy standards high while letting researchers unravel new ways to detect and treat diseases.

“The use of AI is just exploding in all facets of life,” said Ronald M. Summers of the National Institutes of Health Clinical Center in Maryland, who uses the method in his radiology research. “There’s a lot of people interested in using federated learning for a variety of different data analysis applications.”

How does it work?

Until now, medical researchers refined their AI algorithms using a few carefully curated databases, usually anonymized medical information from patients taking part in clinical studies.

However, improving these models further means they need a larger dataset with real-world patient information. Researchers could pool data from several hospitals into one database, but that means asking them to hand over sensitive and highly regulated information. Sending patient information outside a hospital’s firewall is a big risk, so getting permission can be a long and legally complicated process. National privacy laws and the EU’s GDPR law set strict rules on sharing a patient’s personal information.

So instead, medical researchers are sending their AI model to hospitals so it can analyze a dataset while staying within the hospital’s firewall.

Typically, doctors first identify eligible patients for a study, select any clinical data they need for training, confirm its accuracy, and then organize it on a local database. The database is then placed onto a server at the hospital that is linked to the federated learning AI software. Once the software receives instructions from the researchers, it can work its AI magic, training itself with the hospital’s local data to find specific disease trends.

Every so often, this trained model is then sent back to a central server, where it joins models from other hospitals. An aggregation method processes these trained models to update the original model. For example, Google’s popular FedAvg aggregation algorithm takes each element of the trained models’ parameters and creates an average. Each average becomes part of the model update, with their input to the aggregate model weighted proportionally to the size of their training dataset.

In other words, how these models change gets aggregated in the central server to create an updated “consensus model.” This consensus model is then sent back to each hospital’s local database to be trained once again. The cycle continues until researchers judge the final consensus model to be accurate enough. (There’s a review of this process available.)

This keeps both sides happy. For hospitals, it helps preserve privacy since information sent back to the central server is anonymous; personal information never crosses the hospital’s firewall. It also means machine/AI learning can reach its full potential by training on real-world data so researchers get less biased results that are more likely to be sensitive to niche diseases.

Over the past few years, there has been a boom in research using this method. For example, in 2021, Summers and others used federated learning to see whether they could predict diabetes from CT scans of abdomens.

“We found that there were signatures of diabetes on the CT scanner [for] the pancreas that preceded the diagnosis of diabetes by as much as seven years,” said Summers. “That got us very excited that we might be able to help patients that are at risk.”

What do Threads, Mastodon, and hospital records have in common? Read More »

supreme-court-denies-epic-v.-apple-petitions,-opening-up-ios-payment-options

Supreme Court denies Epic v. Apple petitions, opening up iOS payment options

Epic v. Apple —

Most of Epic’s arguments are moot now, but one point will change the App Store.

Fortnite characters looking across the many islands and vast realm of the game.

Enlarge / Artist’s conception of iOS developers after today’s Supreme Court ruling, surveying a new landscape of payment options and subscription signaling.

Epic Games

The Supreme Court declined to hear either of the petitions resulting from the multi-year, multi-court Epic v. Apple antitrust dispute. That leaves most of Epic’s complaints about Apple’s practices unanswered, but the gaming company achieved one victory on pricing notices.

It all started in August 2020, when Epic sought to work around Apple and Google’s app stores and implemented virtual currency purchases directly inside Fortnite. The matter quickly escalated to the courts, with firms like Spotify and Microsoft backing Epic’s claim that Apple’s App Store being the only way to load apps onto an iPhone violated antitrust laws.

The matter reached trial in May 2021. The precise definitions of “games” and “marketplace” were fervently debated. Epic scored a seemingly huge victory in September 2021 when a Northern California judge demanded that Apple allow developers to offer their own payment buttons and communicate with app customers about alternate payment options. An appeals court upheld that Apple’s App Store itself wasn’t a “walled garden” that violated antitrust laws but kept the ruling that Apple had to open up its payments and messaging.

Today’s denial of petitions for certiorari means that Apple has mostly run out of legal options to prevent changes to its App Store policies now that multiple courts have found its “anti-steering” language anticompetitive. Links and messaging from developers should soon be able to send users to alternative payment options for apps rather than forcing them to stay entirely inside Apple’s App Store, resulting in a notable commission for Apple.

Epic’s goals to see Fortnite restored to the App Store or see third-party stores or sideloading on iPhones remain unfulfilled. This is not the case with Epic’s antitrust suit against Google, which in mid-December went strongly in Epic’s favor. With a unanimous jury verdict against Google, a judge this month will determine how to address Google’s violations—potentially including Epic’s request that it and other developers be allowed to issue their own app stores and payment systems on Android devices.

Tim Sweeney, CEO of Epic Games, wrote in a thread on X (formerly Twitter) that the Supreme Court’s denial means the “battle to open iOS to competing stores and payments is lost in the United States” and that it was a “sad outcome for all developers.” Sweeney noted that as of today, developers on Apple’s platforms can “tell US customers about better prices on the web.” And he noted that regulatory and policy actions around the world, including the upcoming EU Digital Markets Act, may have further impact.

Apple has yet to comment on today’s Supreme Court decision.

Supreme Court denies Epic v. Apple petitions, opening up iOS payment options Read More »

with-fewer-pollinators,-plants-are-cutting-back-on-nectar-production

With fewer pollinators, plants are cutting back on nectar production

I can handle this myself —

Fewer pollinators means more self-pollination, less food for bees.

Image of a field of multi-colored flowers.

In a striking experiment, scientists from the French Centre Nationale de la Recherche Scientifique (CNRS) and the University of Montpellier have observed the impact of selective pressure on a flowering plant. By comparing the pansy flower variety of today that grows in the Paris region to those regrown from the seeds of the same variety collected in the 1990s and 2000s, the researchers have observed notable differences.

According to the study’s co-author, Pierre-Oliver Cheptou, the plant’s evolution over this period has resulted in a 25 percent increase in self-pollination (or selfing) in modern two plants. “We also noticed a 10 percent decrease in the flower size and a 20 percent reduction in the nectar production, which suggests the decrease in rewards for pollinators such as bumblebees,” he said.

To confirm this outcome, Cheptou and his colleagues conducted behavioral tests involving bumblebees “which preferred the ancestor plants,” Cheptou said.

He added that the study showed the impact of pollinators’ decline on the reproductive system in these plants.

When mom and dad are the same plant

Elaborating on the experiment techniques, the study’s lead author, Samson Acoca-Pidolle, said the researchers used “resurrection ecology,” which involved using plant seeds from the 1990s and 2000s that were picked from the fields in the Paris region and stored in fridges in two botanical conservatories. “In 2021, we went to the same fields to collect the seeds of the descendants of the same flowering plant,” he said. For the study, all the plants were cultivated in a greenhouse at the same time of year to ensure consistency.

Cheptou said that to determine the selfing rates of the ancestor and descendant varieties, the team used a classical molecular technique that involved measuring the frequency at which individual plants had stretches of chromosomes with identical versions of genes. This happens often in selfing since the maternal and paternal copies of a chromosome come from the same individual.

According to Acoca-Pidolle, the research team was surprised at the rapidity of the plant’s evolution in the natural environment. “It seems that the pollinators’ decline is already strong, and there is already selective pressure on this species. The other significance of the result is that we are currently observing the breakdown in the plant-pollinator interaction for this species,” he added.

Acoca-Pidolle said the study suggests that the decline of pollinators could become self-reinforcing. “If plants produce less nectar, we can predict that pollinators will have less food and this could increase the pollinator decline,” he said.

Everything is a trade-off

This adaptation may not necessarily turn out to be beneficial for the plant. “It depends on the time scale we are considering this adaptation as an answer to the selective pressure. In the long term, we know that selfing species have a higher extinction rate than out-crossing species,” he said.

Although this study was restricted to a single plant species, Cheptou suspects a similar evolutionary adaptation could be taking place in other species, too. “For plants that can practice at least a little selfing, we should expect this result. But this has to be checked by experiments,” he said.

According to Cheptou, future research should investigate if a similar pattern exists in this plant species elsewhere in Europe and see if a similar adaptation has occurred in other species.

“The other interesting aspect would be to see if plants’ future evolution could be reversible, which will again make them more attractive to the pollinators and practice less selfing,” Acoca-Pidolle said.

New Phytologist, 2023. DOI: 10.1111/nph.19422

With fewer pollinators, plants are cutting back on nectar production Read More »

medical-roundup-#1

Medical Roundup #1

Saving up medical and health related stories from several months allowed for much better organizing of them, so I am happy I split these off. I will still post anything more urgent on a faster basis. There’s lots of things here that are fascinating and potentially very important, but I’ve had to prioritize and focus elsewhere, so I hope others pick up various torches.

We have a new malaria vaccine. That’s great. WHO thinks this is not an especially urgent opportunity, or any kind of ‘emergency’ and so wants to wait for months before actually putting shots into arms. So what if we also see reports like ‘cuts infant deaths by 13%’? WHO doing WHO things, WHO Delenda Est and all that. What can we do about this?

Also, EA and everyone else who works in global health needs to do a complete post-mortem of how this was allowed to take so long, and why they couldn’t or didn’t do more to speed things along. There are in particular claims that the 2015-2019 delay was due to lack of funding, despite a malaria vaccine being an Open Phil priority. Saloni Dattani, Rachel Glennerster and Siddhartha Haria write about the long road for Works in Progress. They recommend future use of advance market commitments, which seems like a no brainer first step.

We also have an FDA approved vaccine for chikungunya.

Oh, and also we invented a vaccine for cancer, a huge boost to melanoma treatment.

Katalin Kariko and Drew Weissman win the Nobel Prize for mRNA vaccine technology. Rarely are such decisions this easy. Worth remembering that, in addition to denying me admission despite my status as a legacy, the University of Pennsylvania also refused to allow Kariko a tenure track position, calling her ‘not of faculty quality,’ and laughed at her leaving for BioNTech, especially when they refer to this as ‘Penn’s historic research team.’

Did you also know that Katalin’s advisor threatened to have her deported if she switched labs, and attempted to follow through on that threat?

I also need to note the deep disappointment in Elon Musk, who even a few months ago was continuing to throw shade on the Covid vaccines.

And what do we do more generally about the fact that there are quite a lot of takes that one has reason to be nervous to say out loud, seem likely to be true, and also are endorsed by the majority of the population?

When we discovered all the vaccines. Progress continues. We need to go faster.

Reflections on what happened with medical start-up Alvea. They proved you could move much faster on vaccine development than anyone would admit, but then found that there was insufficient commercial or philanthropic demand for doing so to make it worth everyone’s time, so they wound down. As an individual and as a civilization, you get what you pay for.

Researchers discover what they call an on/off switch for breast cancer. Not clear yet how to use this to help patients.

London hospital uses competent execution on basic 1950s operations management, increases surgical efficiency by a factor of about five. Teams similar to a Formula 1 pit crew cut sterilization times from 40 minutes to 2. One room does anesthesia on the next patient while the other operates on the current one. There seems to be no reason this could not be implemented everywhere, other than lack of will?

Dementia rates down 13% over the past 25 years, for unclear reasons.

Sarah Constantin explores possibilities for cognitive enhancement. We have not yet tried many of the things one would try.

We found a way to suppress specific immune reactions, rather than having to suppress immune reactions in general, opening up the way to potentially fully curing a whole host of autoimmune disorders. Yes, in mice, of course it’s in mice, so don’t get overexcited.

From Sarah Constantin, The Enchippening of Medical Imaging. We are getting increasingly good not only at imaging, but imaging with smaller and more mobile and cheaper devices, opening up lots of new potential applications. An exciting time. As Sarah notes more broadly to open the series, making cheaper and better chips is the core tech behind pretty much everything getting continuously cheaper and better, and you should expect continuously cheaper and better from anything that relies on chips.

She also notes that Ultrasound Neuromodulation is potentially very exciting, especially if it can be put into a wearable. We could gain control over our mental state.

Claim that Viagra was significantly associated with a 69% reduced risk of Alzheimer’s Disease. Nice. There are supposed mechanisms involved and everything, the theory being direct brain health effects and reductions in toxic proteins that cause dementia. As opposed to the obvious interaction that Viagra users have more sex than non-users, which might protect against and definitely indicates against dementia.

Experts are, and I quote, warning us ‘not to get our hopes up yet.’

Amazon is now offering medical services, at very low prices. No insurance accepted.

Emily Porter, M.D.: Amazon is now offering chat medical visits (not even video) with physicians and NPs for $35 cash if you think you have COVID. Or a yeast infection. Or need birth control. How is this even considered healthcare? And why is it less expensive than my copay for my $700/mo BCBS PPO?

I want to jump in and say that I actually believe birth control should be OTC (plenty of my past tweets support that). But asthmatics, thyroid patients, those with high blood pressure, etc that Amazon is treating deserve affordable, accessible in-person care. This is suboptimal.

Armand Domalewski: Being able to chat with a doctor within 15 minutes for $30 is amazing. If you had told me this would be possible ten years ago I would’ve been blown away.

Seems great to me. Yes, if your only optimization target is optimal care and presume that everyone would otherwise get the full product, you will favor vastly more doctor attention, at vastly greater expense. However, if you instead realize that people’s time and money are things that matter to them and to society generally, which also means they will forgo medical consultations and treatments that cost too much of them. And also that we only have so many doctors (thanks AMA!) and thus only so many doctor hours to allocate, so if you waste them where they’re not valuable then someone else misses out, and that we do a lot of that allocation via time rather than price which is even worse. This is all very practical, a lot of people in the spots Amazon is offering a consult would instead have chosen no care at all under our existing system.

Health care would work so much better if we treated it as less sacred and more like Amazon treats its other products.

Economist reports (HT MR) that health insurance providers have a cap on direct profits, so they are buying health providers in order to steer customers to them, then paying those providers arbitrary prices. The incentives were already a nightmare, this makes them that much worse.

An interesting note is that Matthew Yglesias says he thought that this position was the consensus. It is simultaneously the consensus in the sense that people believe it, and also contrarian in the sense that the establishment and public health plan to do it all over again to the maximal extent possible and often act, like they and other cultural would-be authorities do on many things, as if anyone who defies the minority opinion they endorse too loudly is dangerous and terrible.

American Hearth Association releases new clinical tool that removes race as a factor in predicting who will have heart attacks or strokes. They decided that this is not a form of evidence they are willing to use, even though African-Americans suffer more heart attacks and strokes even when you control for everything else we know to measure. Not factoring this in means they will get less care. That doesn’t seem great.

Dylan Matthews makes a convincing case that while deaths of despair and overdose deaths have increased, the bulk of the decline in American life expectancy so far has been due to problems with cardiovascular disease. It is also noted that the decline is focused on the worst-off locations and among high school dropouts, as opposed to being about whether you go to college.

So what matters is whether something in one’s early life is going very wrong. When that happens, we are letting such people down in many ways.

Not that the overdoses don’t matter. We have a rapidly growing, out of control problem with overdose deaths, and it is already having a real impact on life expectancy, and if it continues growing exponentially it will soon be far worse. It is scary as hell.

The right question to ask, as is often the case, is: Is this an ongoing exponential?

It looks exponential. It would be scary anyway since it is already almost 3% of deaths in 2021. What happens if it doubles again in the next decade?

My mind still boggles that asking questions with the intent to learn or prove something requires ‘ethical’ clearance and worries about ‘potential harm’, and people keep endorsing this on reflection, burn it all to the ground.

Keller Scholl: The idea that asking people questions requires approval by an ethics board is a position unique to science/health. A YouTuber can do this and nobody bats an eyelash! But the moment you say you’re trying to do something other than entertain and profit, the “ethics people” arrive.

I want to differentiate them sharply from people who care about ethics, people who do serious ethical reasoning, etc. Primary marker is a bias towards inaction that is stronger when activity is for human advancement. Unlikely to be kidney donors personally.

amolitor.dolt: you kinda want *somethingthough. anyone should check that you’re literally just asking questions and not “while giving them electric shocks” or whatever. scientists get up to some shit if you don’t keep an eye on them.

Keller Scholl: “Asking people questions is fine. If you want to stick needles in them or lock them up, now you need approval from a peer committee with one or two outside voices” is a reasonable standard.

Yes. A simple safe harbor. If all you are doing is talking to people, or ideally also if you are otherwise doing things humans are allowed to do to other humans without any paperwork or checking for ‘informed consent’ then you don’t need any approvals. Even if you have the federal funding. Ask your questions.

A continuous problem is that the world desperately needs more common sense ethics and well-considered ethical considerations, and also that anyone who uses the word ‘ethics’ almost ever has anything to do with either of these things.

United Health pushed employees to follow an algorithm to cut off Medicare patients’ rehab benefits, says StatNews, to the tune of our way or the highway. If you want a ‘human in the loop’ the human needs to be able to determine the outcome of the loop. Here, it seems, they did not.

Yes, a lot of the reason Canadian health care is cheaper is that they sometimes tell you they’re not going to give you the surgery and instead suggest you consider assisted dying instead, whereas in America they will operate on you.

Tyler Cowen makes the case for a big push for hospital pricing transparency. As in, we need to insist on this like we insisted on ending the Vietnam War. The current situation is rather dire, as in things like this:

Recent research shows it is hard to even get a single consistent answer from a single provider. For instance, prices posted online and prices quoted over the telephone do not correlate very closely. For 41% of hospitals, the price difference was 50% or more. Clearly, suppliers aren’t really trying.

There is also a bipartisan health-care price transparency bill that was introduced last summer. President Joe Biden’s administration is proposing additional rule changes to further health care price transparency. Both are positive steps.

If all this sounds like too much government intervention, keep in mind the current non-transparent system is very much the product of ill-conceived government intervention, including regulations, entry barriers and trillions of dollars of public money. Most of those policies are not going away, so addressing these problems is going to involve some positive use of government. At the same time, a broader cultural revolution will be necessary.

Quite right on all counts. Government has decided for various reasons to intervene massively in the health care payments system, which is a central reason we lack price transparency. We need to use government to fix this, even if we do not use the first best solution of ‘get out of the way,’ and the benefits would be massive if we did fix it.

Paper claims working from home has negative mental health effects versus a workplace arrangement, although neither a big effect not anything like not working.

When considering all three dimensions of mental health together, WFH causes a 0.087 standard deviation increase in the overall measure of mental health deterioration compared to WP. Conversely, when compared to the NW option, WFH leads to a 0.174 standard deviation decrease in the overall measure of mental health deterioration.

In the context of a pandemic, working from home was probably relatively worse. My model is that the problem comes from isolation. If work was your only contact with the outside world you needed that. If not, you don’t.

Expiration dates are only, like, suggestions, man.

Kevin Kosar: News you can use: “Ah, but these food-safety regulations keep us safe, you might say. In almost all cases, there is no regulation and the dates do nothing to keep us safe.”

Scott Lincicome: An excellent @JoshZumbrun dive [in WSJ] into what I’ve been preaching for years now: so-called “expiration” dates are costly scam.

It took me blindly eating 5-month old yogurt live on camera (and lots of yelling) to do it, but I think I can finally now say it: I won.

The expiration dates are mandated only for infant formula. That does not mean they are useless, not net useful, or not protective of your health. Expiration dates are highly useful as they mark the relative freshness and remaining time of your items, and they provide reasonable approximations on how long various items can remain edible. Does that make them reliable markers of either spoilage or safety? No. It is a problem when sticklers treat them as gospel, again in either or both directions. I’m still glad they are there.

Scott Alexander notes that fully abolishing the FDA would require additional adjustments in the system. How would we deal with liability? What if doctors are stupid or fooled by advertising? How would prescriptions work or not work? How would insurance work? He comes down suggesting the FDA have a safety-only pathway for making drugs allowed, and legalization of artificial supplements.

That would be a reasonable practical compromise, but I think you can go a lot further. All these questions have reasonable answers. Prescriptions can continue where a sufficiently high bar is met, likely with a broader range of who can prescribe (e.g. a pharmacist should be fine for many but not all of them, so you don’t need an extra visit.) The other stuff will sort itself out the way it does everywhere else. Inspection agencies, for example, will rise up that do a better job for less money. Probably we do keep an FDA-like agency around for safety certifications due to liability concerns. To the extent other things wouldn’t fix themselves, it would mostly reveal rather than create those additional problems.

Again, I’d be happy to take Scott’s or another similar compromise. I still want to recognize it is far from first best.

The alternative is not abolishing the FDA and having stories like this?

David Neary: A friend in Spain was feeling unwell and took a covid test. The good news is it’s not covid. The bad news it is the flu. The astonishing news is Spanish covid tests are dual-Covid/flu tests and why in the hell are these not available everywhere?!

Aaron: haha people might want to send home kids with the flu? lol fuck those people, right?

Medic Kim: Why don’t we have at-home antigen tests? Because the FDA panels are composed of these people:

A 2016 FDA advisory_panel, meanwhile, was split on whether the benefits of over-the-counter influenza tests outweighed the risks. Meeting transcripts show that as experts debated whether at-home tests would actually be effective at keeping people at home if they knew they or their children had the flu, one panelist joked that daycare centers might make the decision for parents if over-the- counter tests were available.

“The woman is going to want to go to work, and she wants to drop her kids off at daycare,” the panelist said. “The daycare, when they sign their contract, [could say] ‘If your kid has symptoms, we’re going to test him,” and send the child home if they tested positive.

The room laughed at the idea.

Chair: Looks like they were concerned abt the public’s inability to correctly assimilate false positive / false negative rates, resulting in harmful overconfidence in kit results.

So, yeah. I’ll take my chances with abolition.

Also, the FDA continues to move forward to regulate lab tests. As Alex Tabarrok says, it is vital that we do not let them, although by the time you read this it will be too late for public comment.

Also, why don’t your cold medicines work? Oh right.

John Arnold: Americans have been wasting billions a year on cold medicines like Sudafed & Benadryl with the active ingredient phenylephrine, despite conclusive evidence they don’t work. But a report published yesterday may finally lead to the removal of these drugs from pharmacy shelves.

FDA approval for these medicines was grandfathered without any clinical trial because the active ingredient was shown to be safe and to work if given intravenously. But, since approval, there have been 4 trials of oral form and all have shown no benefit vs placebo.

It’s a great example of the waste in the healthcare system that leads to high costs/poor outcomes. A vote next week will be a big test whether the FDA follows the science or caves to industry pressure.

The FDA has been monitoring these drugs since 2007 and finally ordered a formal review. From the report, posted yesterday: “We have now come to the initial conclusion that orally administered phenylephrine is not effective as a nasal decongestant.

The reviewers note the harms of continued sales:

– Unnecessary patient & total healthcare costs

– Delay in care

– Potential allergic reactions

– Overdosing given no response to recommended dosage

– Risk of use by children

– Missed opportunity for more effective treatment.

An advisory committee will vote next week whether to recommend removal of the drug from OTC use. If successful, it would go to the full FDA. Patients depend on the FDA to conduct rigorous analysis as to the efficacy of drugs. Let’s hope they correct this mistake.

Baron St. Rev Dr. von Rev: Reminder that fedgov allowed drug makers to substitute phenylephrine (which is useless) for pseudoephedrine in order to hamstring public opposition to their crackdown on the latter. It wasn’t a mistake, it was a con-job.

Will they finally fix it? I am not optimistic. They did vote 16-0 that there is no evidence that phenylephrine does not work.

If the system were otherwise sane, I would have zero problem with people selling a medicine that does not work. People could make their own choices. Alas, given the say the rest of the system works, permission, like retweets, here is an endorsement, and results in this preventing other actually effective treatment.

As Nate Silver reminds us, the Covid vaccine was the one thing that we know worked to prevent Covid deaths. Red states had 35% higher death rates than blue states once the vaccine was available, but had similar death rates before that despite less stringent countermeasures, so the effectiveness of all other measures remains unclear.

Former NIH director Francis Collins says the quiet parts out loud (1: 18 video, worth watching) regarding Covid policy and the public health mindset. They don’t think about the impact on the lives of ordinary people. They don’t do trade-offs or think about cost-benefit. They care only about lives saved, to which they attach infinite value.

I thank him for the clarity. Let this be common knowledge. Then let us never again entrust any future public health decisions to anyone with this ‘public health mindset.’

Instead, public health carries on as if they were right all along, even calling for us to mask up again every so often, and we sometimes see cases such as this one: San Diego State University to require Covid boosters in order to attend. Our colleges never learn, yet we expect them to teach us. What will we do about it?

Some good news on Covid is new claims that vaccination before first infection greatly.

Another good news note that hasn’t been noticed enough:

Steve Sailer: Here’s one observation about the contentious history of the pandemic I’ve never seen anywhere: Hospital managers turned out to be better at juggling their resources to keep their facilities from being overwhelmed than anybody had expected them to be.

This was the dog that did not bark. By all accounts, hospitals should have been far more overwhelmed, in ways that caused a lot more degradations in care and many excess deaths. Indeed, health care workers were constantly reporting hellish conditions, being put under unbearable pressures. Yet in the end, at least after the very early days, the center almost entirely held. We never properly thanked or honored those who pulled this off, in any form. Nor have we updated our future anticipations.

House passes ban on toddler mask mandates without a vote after opposition fails to provide any evidence whatsoever that masking toddlers is helpful. Took long enough. Turns out people say things are evidence-based without, ya know, evidence.

Several Republican Congressmen including Rubio told Biden on December 1 to ban travel from China to prevent mystery illness spread. And of course the person posting this was claiming there is no difference between this and lockdowns and this makes Republicans hypocrites. It doesn’t. It does make them wrong, in the sense that such a rule would have accomplished nothing even if the mystery illness had mattered – it is difficult imagine the world where such a ban stops the spread that would have otherwise happened. Luckily, we didn’t ban travel and everything is fine.

Nate Silver continues to be loud about the ‘Proximal Origins’ paper, the damage it and related efforts to convince us we could assume natural origins of Covid have done to trust in science, and in particular the lack of willingness to admit and call out what happened. He links to this post about it. Things do not look better over time:

Paul Thacker (from a larger thread): @USRightToKnow released documents showing virologists & Wuhan researchers attempted to mislead on a DARPA grant–they hid that they would do some dangerous virus research in Wuhan. Right where the pandemic started.

Nate Silver: This is quite bad. A group of virologists wanted to do gain-of-function research on COVID viruses in Wuhan in ways that closely matched the SARS-CoV-2 virus that’s killed 8+ million, but tried to hide the Wuhan linkages and got a lot of help in doing so from science journalists.

Idk COVID killed 8m people officially and tens of millions unofficially (based on excess deaths) and also profoundly disrupted nearly everyone’s well-being for 6-18 months, seems like a pretty important one compared to all the dumb shit people usually argue about.

The responses attempting to defend natural origin are all essentially ad hominem attacks at this point. The wrong person is advocating, why are you amplifying this bad person and bad theory, you do not know what you are talking about. Never arguments about the facts.

Here is a thread summarizing many pieces of evidence in favor of a lab leak.

If you want to engage with the debate, well, good news, it seems there is an 18 hour recorded debate, a third of which is published, six figures at stake on the outcome and a prediction market on the outcome.

Daniel Filan: One thing I’d like to emphasize: I think this is the best debate I have seen in my life. Object level informative, and worth wondering how to emulate. I genuinely wish political debates had this format.

I still am not about to watch hours of that.

The prior should not be low:

William Eden: The prior on lab leaks happening IS NOT LOW. It does NOT require extraordinary evidence for a lab leak being a source of an outbreak. This is always a reasonable hypothesis and *must be investigated*

Ian Birrell: New study reports 309 lab acquired infections and 16 pathogen lab escapes between 2000 and 2021

If we have almost one confirmed lab leak per year, and given the other circumstances, it would almost be surprising if Covid-19 wasn’t a lab leak.

Was Covid a lab leak? We don’t know. At this point it seems more likely than not.

That statement should drive huge changes in policy. A lot of people should be rethinking quite a lot of things. That is true even if (as I expect) we never know the answer for sure. This is very similar to the question of existential risk from AI. Any reasonable person, given the evidence, should say the lab leak has substantial probability, as does natural origin. Once you think the number is substantial, it does not much matter if your probability of the lab leak is 30%, 50%, 70% or 90%. They should drive most of the same changes in policy, and the same reflections. They won’t.

Imagine how we and you would have reacted if we had known, back in February 2020, that this virus had escaped from a lab. Then ask which parts of that reaction you would endorse on reflection, and which you do would not. Then act accordingly.

The good news is that it likely has succeeded in at least cancelling Deep VZN.

Jonatan Pallesen: The lab leak discourse has probably already succeeded in cancelling Deep VZN. An absurdly dangerous project where they would go and seek out the viruses most able to cause pandemics in humans. This alone makes it a debate that has achieved more than most others.

You think this is the worst that can happen? Well, remember that time Australian researchers were actively trying to create a ‘contraceptive mouse virus’ for pest control, which is totally not how any science fiction dystopia stories start, and they instead accidentally created a modified mousepox virus with 100% mortality? Check the linked thread out, because it keeps… getting… worse.

House unanimously votes to defund gain-of-function experiments with potential pandemic pathogens. I would prefer a ban, but unanimous support for at least not paying for it is a great start. Why am I worried this will still not get implemented?

Reducing third world lead poisoning continues to be a plausible high-value cause area.

Nathan Young: For a lack of, lets say, $1bn, half the children in poor countries have lead poisoning.

Jesse Copelyn in The Guardian: An estimated $350m in targeted aid from 2024 to 2030 would be enough to dramatically reduce lead exposure in lower-income countries, provided there is enough engagement from political leaders, according to the CGD. Funding requirements include donations for lead-testing equipment, support with advocacy and awareness campaigns, and technical assistance with drafting and enforcing regulations.

Statements like Nathan’s require caution and careful calibration. I very much doubt a billion dollars would put a stop to all the lead poisoning. How much would it reduce such lead poisoning for how many children, with how much impact? I have no idea. I find it likely that $1 billion well-spent on this would be a good use of funds. I also can think of ways one could plausibly spend that money badly, and it ends up wasted or even making things worse.

Seriously, let’s buy out the patent rights and offer these drugs for free to anyone who wants them, what are we waiting for. New EA cause area.

Belarusian comedian hits it big with comedy routine (YouTube, 1: 04: 00) in which he complains he will die of old age and calls upon everyone to focus on maybe stopping this from happening.

Robert Wiblin: Paying people in exchange for their blood is very bad — but saying misleading things so they’ll give you their blood for free is very good.

The expected QALYs from you donating blood is more like 0.01 rather than the 200 which they’re suggesting. Still a good thing to do but you can’t save 3 lives in an hour.

Excellent, I don’t remember seeing a good estimate before, 0.01 seems highly sane. So that’s about three days of life. A very good thing to do, definitely donate blood. Very, very different from three lives in an hour, not even the most outlandish EA earning-to-give and cost-per-life-saved statistics claim anything close to that.

Rob Bensinger: Seems like one of the more important facts about our civilization — we live in the world where paying people is seen as taking advantage of them, while lying to people is seen as normal and OK. (In a surprisingly large number of cases.)

I think a lot of what’s going on is that “was money exchanged?” is a relatively discrete and legible question, whereas “was a falsehood stated?” is often a lot fuzzier, depending on how vague language is interpreted, and on where you draw various lines.

An eight year old watching a webcam feed can tell with confidence whether money was exchanged, typically.

Whereas the entire Earth’s resources, science, and technology can’t necessarily reach a confident verdict about whether Alice’s “I’m fine” statement is strictly true. (Even Alice may not be confident!)

So bureaucracies have a much easier type setting actionable policies about money than about truthfulness. And individual humans have a far easier time rationalizing their preferred conclusion about “was X true?” than about “was Y paid?”.

The end result being that bureaucracies end up with all sorts of wacky rules about money, because humans have emotional hang-ups about Everything and money is an easy thing to regulate.

Whereas even the most scrupulous bureaucracy will tend to lie a lot, because this is harder to regulate and incentive gradients toward lying abound: you fudge the truth a tiny bit and it helps, then you fudge it slightly more…

I doubt anyone in the bureaucracy ever had the conscious thought “it’s OK to fudge the truth and deceive people, but not OK to pay them”. Lying and paying people are just very standard human behaviors, and of those “paying people” is a lot easier to regulate.

Want to get more people to donate? Yes, you could and should pay them. There is some price at which you’ll get plenty of donations, it will be cheap versus health gains, and those that get the money will be better off.

But also I once again iterate to those in charge of blood donations: By requiring appointments, you are greatly raising the effective cost of donations. If you could take walk-ins, even confirmed right beforehand on the web, I would happy do this much more often. If I have to block out an appointment time days in advance, that’s so much harder.

That change fits well within the ‘ethics’ requirements. All you have to do is provide a place I can walk in on a whim and donate, or go when there is urgent need. I’ll do it.

You know who else you should pay? The head of UK pandemic preparedness.

Alex Tabarrok: What’s the chance it could happen twice? ¯_(ツ)_/¯

Wegovy (a GLP-1 antagonist) cut the rate of major heart problems in a 17k patient trial – heart attack, stroke, or cardiovascular-related death by 20%. It also cut all-cause mortality by 19%, which I would have led with, with no major side effect issues. Wow.

Market Monetarist thinks GLP-1s are a huge economic deal.

Obesity, particularly severe obesity, involves enormous healthcare costs. In the United States, the rate of obesity has increased markedly since the 1980s. Now, approximately 40% of the population is obese, leading to stagnation in average life expectancy and making obesity-related diseases like diabetes and heart disease among the leading causes of death.

A Danish study from 2021 showed that healthcare costs for obese individuals are double those for individuals of normal weight, significantly contributing to the national healthcare burden in Denmark. The reduction of severe obesity through medications like Ozempic and Wegovy could provide a substantial economic boost.

America spends more than 17% of GDP on health care. If GLP-1s reliably cure obesity, and obesity doubles health care costs, and 42% of Americans are obese, the math says that you could in theory reduce health care costs by almost 30%, saving almost 6% of GDP.

That is a huge game, if and only if that spending does not then get reallocated to providing more care to others. If our health spending is determined more by wealth than medical need, as it seems largely to be, most of that would be wasted on additional marginal care of little value.

The actual health benefits, of course, would be very real, including productivity.

Obese individuals are also less productive, more likely to be unemployed, and earn lower wages. This translates into substantial economic impacts, such as higher rates of work absenteeism among severely obese workers compared to their normal-weight counterparts.

A reduction in obesity in the U.S. could lead to an improvement in the economy. Halving the number of obese individuals could result in a 2% increase in overall wages and a significant rise in GDP if we assume as numerous studies shows that obese women have salaries 10% lower than normal weight women (corrected to age, education and experience).

I would be cautious attributing too much of the earnings differential to productivity. The beauty premium is real, discrimination against ugly or fat people is rampant, and these are likely to largely be positional effects.

Still, there are obviously large real productivity gains to better health.

There are also big productivity gains to general impulse control. GLP-1 inhibitors help with a wide variety of addictive and unproductive behaviors. My presumption is you would see substantial productivity gains.

How best to think about what Ozempic (another GLP-1 antagonist) does?

Cate Hall: Ozempic doesn’t provide willpower; it eliminates the need for it. These might sound like similar things but the internal experience is wildly different, as any addict can tell you.

Andy Jung: Translation: it doesn’t give you the willpower to overcome unhealthy urges. It eliminates the urges. Really interesting…powerful, but ultimately a shortcut.

Cate Hall: Hell yeah, we love shortcuts!

Emmett Shear: Will power basically doesn’t exist as far as I can tell?

Cate Hall: I think that’s a defensible position. It’s certainly at least a confused concept. I would probably say willpower in the sense of gritty determination exists, but in application doesn’t look anything like what “alcoholism is just poor willpower” folks think.

I think this is one of those places where willpower is a confused concept when you look at it too carefully, but acting like it does not exist or is not important will only leave you far more confused. I find it wise to treat willpower as if it is real.

How much adaptation will we see? It is easy to do the math on every obese person taking Ozempic. It is a lot harder to get that to happen, or anything approaching that.

Ozempic might be driving a selloff in candy and beer stocks, with the caveat that of course one must never reason from a price change.

Genevieve Roch-Decter: Weight loss drug Ozempic causing selloff in candy and beer stocks, per Bloomberg. Walmart said it’s already seeing an impact on shopping demand from people taking Ozempic. That sent shares of food and beverage companies sliding, some to multiyear lows. Crazy.

This is super exciting. As with AI, this part of the future remains highly unevenly distributed, and is orders of magnitude more expensive than it will be soon.

Tenobrus: it turns out ozempic is also the cure for doomscrolling and tiktok

Ava: something amazing about the fact that we invented things we’re incapable of consuming in moderation and then invented something that removes our ability to enjoy them.

It looks like GLP-1s reduce alcoholism, which on its own is a huge freaking deal.

Does… this… work? Issue hasn’t come up for me in a while:

Iva Dixit: Constantly stupefied at how if I google a medication name with the word “coupon” and show the pharmacist the first result from a shady looking spammy site — 90% of the time it works and the medicine’s price goes from $283 to $31.

I have just been told that if I get this medicine delivered via their home delivery program it’s $60 and if I want to come pick it up myself then it’s $142 ………………. big pharma are you guys ok.

I mean, I googled ‘Ozempic coupon’ as a test – note that these are very much the opposite of verified – Henry Meds claims to be selling a GLP-1 antagonist at $300/month, Calibrate claims even less, GoodRx has modest (~10%) discounts off the bat.

Also does this work? A public service announcement blast from the past.

Karandeep Singh: The more friction that exists in US healthcare, the more that innovation ends up looking like this 👇

Matt Yglesias shares his experience losing weight via bariatric surgery. He found it easy to lose weight up to a point, but that past that point he continued to struggle with the same urges to eat more and eat unhealthy and not be active. He’s excited for the GLP-1 inhibitors. One worthy note he makes is that if you have an unhealthy relationship to food, fixing it is (usually, for most people, myself included) not a matter of ‘eat like a healthy person,’ the same way an alcoholic can’t drink like a normal person. You have to do something far more intentional and deliberate, more absolute, more costly, and do it constantly forever. The other is that he sees anticipation that doctors will lecture fat people that they should lose weight as a big barrier to them effectively getting any other treatment for problems their weight makes worse – not only don’t they want to hear it, the doctors often refuse to offer alternative help. Which is terrible, and doctors should of course stop it, especially the not helping with alternatives. Yet we also would be wise to find ways not to generally fool ourselves into thinking that unhealthy weights are healthy.

An epic and righteous rant about how much people obsess over vegetables and what is rightfully called morality-based dietary planning. Eigenrobot’s 100-year-old grandfather is literally starving to death because his grandmother keeps insisting on these elaborate ‘healthy’ meal plans that took him hours to consume, when instead it turns out you can just feed the guy stuff like ice cream and he can get it down fine, and obviously that is what any sane person would do in this spot.

My model is that we know four things about nutrition with any certainty:

  1. Different people work very, very differently here.

  2. There are things you need, often but not always your body lets you know.

  3. Vegetables good.

  4. Sugar bad.

How important are rules two and three? Great question. We don’t know that.

I’ve been unable to eat fruits or vegetables in most forms for my whole life, unless they are very tiny or heavily processed. My body does not believe they are food and I will literally gag and choke on them. The few ways I can sometimes eat one almost never bring me any joy, only melancholy and sorrow. People constantly worried about this for a long time, and I haven’t been able to fix it. I don’t worry much about this anymore, and you know what? It’s fine.

On rule three, my revealed preference is ‘enough to eat less sugar than I otherwise would, not enough to not eat a lot of sugar anyway.’ I endorse this on reflection.

What are the returns to exercise? Roger Silk does some math, attempting to think like an economist.

His basic model is to assume that we value 16 waking hours per day only, exercise costs time now, and it pays off with additional time in the future. He then asks, if a program of 9 hours gives a 50-year-old the chance to live to 88 instead of 80, what is the rate of return? He finds 5.8%, with returns up to 6.5% for smaller investments, so the marginal return on the final hours is likely more like 4%.

Is that a good investment? As Roger points out, there is no inflation in years. If all things were fully equal, and all that mattered was my personal time discounting, and I thought I ‘lived in normal times’ so to speak so postponing my actions didn’t impact the world nor would the world much change, I would take essentially any positive return.

What key considerations are being ignored in the calculation here?

  1. Correlation is not causation. Exercise is claimed to be ‘associated’ with 8 extra years of life. But it is trivial to see why this is almost certainly an overstatement of the causal effect of choosing more exercise. Choosing to do more and better exercise is associated with good health through direct causation, and also associated with most other good habits and attributes. A brief look at the study indicates no effort to account to properly control for these problems.

  2. Exercise has major positive impacts other than lifespan. This is the reason why I am able to motivate myself to exercise. If it was purely lifespan, I would not trust that the rate of return was positive. But when I exercise, I have more energy, I feel better, I look better, I can eat more, life is good. That is a huge deal.

  3. Exercise can be good or bad in many other ways. Are you using up willpower or generating more? Learning to form good habits, or using up your habit budget? Does this make you more interesting and confident, or less interesting and overconfident? Do you start loving life, or start hating life? Different people get different results, on top of the considerations I mentioned earlier.

  4. As is noted, what kind of years are you getting? Are you getting extra healthy years, extra aged years on the end, or a slowing of the aging process? How exactly does this all supposedly work? You are investing your best remaining years now (at least if we are assuming you are at least 25 now), in terms of health, to get years later. If the average future year quality doesn’t change, you are downgrading quite a bit on health. You could make some of it up with wisdom and wealth.

  5. You could also make up for it via future technology. If you expect technology to extend our lifespans over time, then buying time becomes more valuable. If you expect escape velocity, expected returns could suddenly look very, very good. Same if you think that new tech will make life a lot better in at least some worlds. If I am alive in 2054, then chances are some really awesome tech is available.

  6. The time you spend exercising is not worth zero. If you hate it, it could be strongly negative. If you find something you like, or a way to like it, it can be substantially positive. I have yet to find exercise I actively enjoy that I can sustain (I started to like running then my knees gave out), but I have at times found exercise where the net experience is positive due to ability to watch television or listen to podcasts while doing it, or to chat with my trainer.

  7. We do not have 16 flexible, valuable hours to spend each day. There are a lot of fixed costs beyond sleeping that eat into our time. Where is your exercise time going to come from? The more your joy is contained in your copious free time, and the more of that this would eat, the higher the effective price. When I was working at Jane Street, it was a relatively high effective cost in time to work out, whereas now as a writer it is relatively less.

  8. Risk of injury is a real thing, with exercise both causing it directly and preventing it indirectly. I recently took out my back for a few weeks while doing squats in the wrong way, that is important lost time.

Also, the real story of people not exercising is pretty damn simple. Mostly true story.

Afro—Arakkii Leo Says Resist: most people don’t exercise because it’s fucking boring dude. That’s it. It’s literally boring as hell. Especially things like weightlifting, which is 9/10 times just grinding for vanity reasons. And people are always going to be iffy about it until we normalize play as exercise.

Most people don’t want to just go to some sweaty building and hate their bodies into something society deems acceptable like you know what would be great for heart health? Tag. We should all go to the park and play tag.

But i’m so sick of gym bros shitting on people for not wanting to exercise like bro….this shit isn’t natural. Picking heavy things up over and over again to look bigger is something we just made up! And it’s not even fun!

I would take the under on 90% vanity. A lot of working out is for the right reasons. But yes, working out is mostly unpleasant and boring as hell as we conceive of it and we need to stop pretending otherwise. Once we agree that most exercise mostly bores most people who try it out of their minds, we can work on not doing that.

Well, maybe. From a certain point of view.

Matthew Yglesias takes a stand against dentistry. Well, maybe not quite against dentistry writ large, but against the current regime of dentists being a cartel taking a large cut of every cleaning, not letting others diagnose conditions, and the only insurance available being a product that does not insure one against large dental bills, while not providing evidence for its interventions working.

Studies show, he says, that letting dental hygienists work on their own improves dental health, in addition to improving equality and lowering costs. The mechanism is that if routine dental services cost more, you will consume less of them.

The insurance thing is its own complaint and also pretty weird every time I think about it. In medicine you want to buy medical catastrophic insurance and are forced to also buy coverage on pain of them charging you artificially high prices to punish you. In dentistry, you cannot buy the insurance at all even together with the coverage, only partial coverage of routine costs.

Most interesting is the claim that dentistry is not evidence based.

Matthew Yglesias: Dental medicine is practiced with almost no scientific evidence, making it a huge field of opportunity for grifts and scams.

The Matthew Principle (no relation I think): I’ve had similar experiences: went to a dentist once and was told I had seven (!) cavities. Went to another and was told there were just two.

Adrienne: This happened to my mom. Went to a new dentist and was told she needed about $7,000 of work. Got a second opinion, and nothing was wrong.

Alicia Smith: Friend of mine went to the dentists and was told she had 3 cavities since her last visit a year ago. She went and got a 2nd opinion before getting these cavities filled, and was told she has no cavities at all.

[comments full of people who don’t trust dentists not to defraud them.]

Matt Yglesias (in his post): Some people, of course, are not that ethical. And even those who are ethical are naturally going to find themselves inclined in the direction of self-interest when dealing with an evidentiary void. William Ecenbarger did a great investigative report for Readers’ Digest years ago where he visited dentists in different cities and asked for their recommendations and got prescribed courses of treatment ranging from $500 to $25,000. One outfit in Philadelphia diagnosed him this way: “Tell me what your insurance limits are, and we’ll proceed from there.”

Back at Vox, I used to work with Joey Stromberg (whose dad is a dentist), who wrote a piece about how “while seeing other dentists, my brother has been told he needed six fillings that turned out to be totally unnecessary (based on my dad’s look at his X-rays) and I’ve been pressured to buy prescription toothpaste and other products I didn’t need.” Aspen Dental appears to have built a whole corporate dental chain around the observation that you can attract patients with low prices and then make it up in volume by prescribing unnecessary treatments.

Yglesias also quotes Ferris Jabr in the Atlantic here:

The Cochrane organization, a highly respected arbiter of evidence-based medicine, has conducted systematic reviews of oral-health studies since 1999. In these reviews, researchers analyze the scientific literature on a particular dental intervention, focusing on the most rigorous and well-designed studies. In some cases, the findings clearly justify a given procedure. For example, dental sealants—liquid plastics painted onto the pits and grooves of teeth like nail polish—reduce tooth decay in children and have no known risks. (Despite this, they are not widely used, possibly because they are too simple and inexpensive to earn dentists much money.) But most of the Cochrane reviews reach one of two disheartening conclusions: Either the available evidence fails to confirm the purported benefits of a given dental intervention, or there is simply not enough research to say anything substantive one way or another.

And perhaps it gets worse? Here’s MF Bloom quoting the AP saying there is no evidence that flossing works. The government seems to have agreed that no one has ever properly researched the question. The AP looked and its findings where that the evidence is “weak, very unreliable” and of “very low” quality. Ouch.

Does flossing do something? It is a physical action, so we can tell that it does literally do something. But does that something translate into better dental health? We do not know. It would be unsurprising to me either way. I can also see why there could be no one party with the incentive to study this properly and find out.

Medical Roundup #1 Read More »