Author name: Mike M.

Good Omens will wrap with a single 90-minute episode

The third and final season of Good Omens, Prime Video’s fantasy series adapted from the classic 1990 novel by Neil Gaiman and Terry Pratchett, will not be a full season after all, Deadline Hollywood reports. In the wake of allegations of sexual assault against Gaiman this summer, the streaming platform has decided that rather than a full slate of episodes, the series finale will be a single 90-minute episode—the equivalent of a TV movie.

(Major spoilers for the S2 finale of Good Omens below.)

As reported previously, the series is based on the original 1990 novel by Gaiman and the late Pratchett. Good Omens is the story of an angel, Aziraphale (Michael Sheen), and a demon, Crowley (David Tennant), who gradually become friends over the millennia and team up to avert Armageddon. Gaiman’s obvious deep-down, fierce love for this project—and the powerful chemistry between its stars—made the first season a sheer joy to watch. Apart from a few minor quibbles, it was pretty much everything book fans could have hoped for in a TV adaptation of Good Omens.

S2 found Aziraphale and Crowley getting back to normal, when the archangel Gabriel (Jon Hamm) turned up unexpectedly at the door of Aziraphale’s bookshop with no memory of who he was or how he got there. The duo had to evade the combined forces of Heaven and Hell to solve the mystery of what happened to Gabriel and why.

In the cliffhanger S2 finale, the pair discovered that Gabriel had defied Heaven and refused to support a second attempt to bring about Armageddon. He hid his own memories from himself to evade detection. Oh, and he and Beelzebub (Shelley Conn) had fallen in love. They ran off together, and the Metatron (Derek Jacobi) offered Aziraphale Gabriel’s old job. That’s when Crowley professed his own love for the angel and asked him to leave Heaven and Hell behind, too. Aziraphale wanted Crowley to join him in Heaven instead. So Crowley kissed him and they parted. And once Aziraphale got to Heaven, he learned his task was to bring about the Second Coming.

Bird flu hit a dead end in Missouri, but it’s running rampant in California

So, in all, Missouri’s case count in the H5N1 outbreak will stay at one for now, and there remains no evidence of human-to-human transmission. Though both the household contact and the index case had evidence of an exposure, their identical blood test results and simultaneous symptom development suggest that they were exposed at the same time by a single source—what that source was, we may never know.

California and Washington

While the virus seems to have hit a dead end in Missouri, it’s still running rampant in California. Since state officials announced the first dairy herd infections at the end of August, the state has now tallied 137 infected herds and at least 13 infected dairy farm workers. California, the country’s largest dairy producer, now has the most herd infections and human cases in the outbreak, which was first confirmed in March.

In the briefing Thursday, officials announced another front in the bird flu fight. A chicken farm in Washington state with about 800,000 birds became infected with a different strain of H5 bird flu than the one circulating among dairy farms. This strain likely came from wild birds. While the chickens on the infected farms were being culled, the virus spread to farmworkers. So far, two workers have been confirmed to be infected, and five others are presumed to be positive.

As of publication time, at least 31 humans have been confirmed infected with H5 bird flu this year.

With the spread of bird flu in dairies and the fall bird migration underway, the virus will continue to have opportunities to jump to mammals and gain access to people. Officials have also expressed anxiety as seasonal flu ramps up, given influenza’s penchant for swapping genetic fragments to generate new viral combinations. Reassortment and exposure to humans both increase the risk of the virus adapting to spread from human to human and spark an outbreak.

Google offers its AI watermarking tech as free open source toolkit

Google also notes that this kind of watermarking works best when there is a lot of “entropy” in the LLM distribution, meaning multiple valid candidates for each token (e.g., “my favorite tropical fruit is [mango, lychee, papaya, durian]”). In situations where an LLM “almost always returns the exact same response to a given prompt”—such as basic factual questions or models tuned to a lower “temperature”—the watermark is less effective.
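To make the "entropy" point concrete, here is a minimal sketch that measures how many bits of choice a next-token distribution offers. The logit values and the helper names softmax and entropy_bits are made up for illustration and are not taken from Google's toolkit. The less entropy there is, the less room a watermark has to nudge the choice.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative (assumed) logits: four equally plausible fruits vs. one dominant answer.
fruit = softmax([1.0, 1.0, 1.0, 1.0])      # mango, lychee, papaya, durian
factual = softmax([10.0, 0.0, 0.0, 0.0])   # a basic factual question

print(entropy_bits(fruit))    # four equally likely candidates: 2 bits
print(entropy_bits(factual))  # close to zero: almost nothing to choose between

# Lowering the temperature also removes entropy from a mixed distribution.
mixed = [2.0, 1.0, 0.5, 0.0]
print(entropy_bits(softmax(mixed, 1.0)), entropy_bits(softmax(mixed, 0.5)))
```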

A diagram explaining how SynthID’s text watermarking works. Credit: Google / Nature

Google says SynthID builds on previous similar AI text watermarking tools by introducing what it calls a Tournament sampling approach. During the token-generation loop, this approach runs each potential candidate token through a multi-stage, bracket-style tournament, where each round is “judged” by a different randomized watermarking function. Only the final winner of this process makes it into the eventual output.
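The bracket idea can be sketched in a few lines. This is a toy illustration of the description above, not Google's implementation. The names g_value and tournament_sample, the SHA-256-based scoring, the one-bit scores, the random tie-breaking, and the choice of 2**rounds candidates are all assumptions made for the example.

```python
import hashlib
import random

def g_value(token, context, key, layer):
    """Hypothetical watermarking function: a keyed pseudorandom bit for a
    token, given the recent context and the tournament round (layer)."""
    msg = f"{key}|{layer}|{'|'.join(context)}|{token}".encode()
    return hashlib.sha256(msg).digest()[0] & 1

def tournament_sample(probs, context, key, rounds=3, rng=None):
    """Draw 2**rounds candidates from the model's distribution, then run a
    bracket in which each round is judged by a different g function."""
    rng = rng or random.Random()
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    candidates = rng.choices(tokens, weights=weights, k=2 ** rounds)
    for layer in range(rounds):
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            ga = g_value(a, context, key, layer)
            gb = g_value(b, context, key, layer)
            if ga == gb:
                winners.append(rng.choice([a, b]))  # tie: pick at random
            else:
                winners.append(a if ga > gb else b)
        candidates = winners
    return candidates[0]  # only the final winner reaches the output

probs = {"mango": 0.4, "lychee": 0.3, "papaya": 0.2, "durian": 0.1}
context = ["my", "favorite", "tropical", "fruit", "is"]
print(tournament_sample(probs, context, key="demo-key", rng=random.Random(0)))
```

A detector holding the same key could recompute the scores for each token of a text and check whether they run higher than chance. When one token has nearly all the probability, every candidate is the same token and the bracket has nothing to decide, which matches the low-entropy caveat.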

Can they tell it’s Folgers?

Changing the token selection process of an LLM with a randomized watermarking tool could obviously have a negative effect on the quality of the generated text. But in its paper, Google shows that SynthID can be “non-distortionary” on the level of either individual tokens or short sequences of text, depending on the specific settings used for the tournament algorithm. Other settings can increase the “distortion” introduced by the watermarking tool while at the same time increasing the detectability of the watermark, Google says.

To test how any potential watermark distortions might affect the perceived quality and utility of LLM outputs, Google routed “a random fraction” of Gemini queries through the SynthID system and compared them to unwatermarked counterparts. Across 20 million total responses, users gave 0.1 percent more “thumbs up” ratings and 0.2 percent fewer “thumbs down” ratings to the watermarked responses, showing barely any human-perceptible difference across a large set of real LLM interactions.

Google’s research shows SynthID is more dependable than other AI watermarking tools, but its success rate depends heavily on length and entropy. Credit: Google / Nature

Google’s testing also showed its SynthID detection algorithm successfully detected AI-generated text significantly more often than previous watermarking schemes like Gumbel sampling. But the size of this improvement—and the total rate at which SynthID can successfully detect AI-generated text—depends heavily on the length of the text in question and the temperature setting of the model being used. SynthID was able to detect nearly 100 percent of 400-token-long AI-generated text samples from Gemma 7B-1T at a temperature of 1.0, for instance, compared to about 40 percent for 100-token samples from the same model at a 0.5 temperature.

Claude Sonnet 3.5.1 and Haiku 3.5

Anthropic has released an upgraded Claude Sonnet 3.5, and the new Claude Haiku 3.5.

They claim across the board improvements to Sonnet, and it has a new rather huge ability accessible via the API: Computer use. Nothing could possibly go wrong.

Claude Haiku 3.5 is also claimed as a major step forward for smaller models. They are saying that on many evaluations it has now caught up to Opus 3.

Missing from this chart is o1, which is in some ways not a fair comparison since it uses so much inference compute, but does greatly outperform everything here on the AIME and some other tasks.

METR: We conducted an independent pre-deployment assessment of the updated Claude 3.5 Sonnet model and will share our report soon.

We only have very early feedback so far, so it’s hard to tell how much what I will be calling Claude 3.5.1 improves performance in practice over Claude 3.5. It does seem like it is a clear improvement. We also don’t know how far along they are with the new killer app: Computer usage, also known as handing your computer over to an AI agent.

  1. OK, Computer.

  2. What Could Possibly Go Wrong.

  3. The Quest for Lunch.

  4. Aside: Someone Please Hire The Guy Who Names Playstations.

  5. Coding.

  6. Startups Get Their Periodic Reminder.

  7. Live From Janus World.

  8. Forgot about Opus.

Letting an LLM use a computer is super exciting. By which I mean both that the value proposition here is obvious, and also that it is terrifying and should scare the hell out of you on both the mundane level and the existential one. It’s weird for Anthropic to be the ones doing it first.

Austen Allred: So Claude 3.5 “computer use” is Anthropic trying really hard to not say “agent,” no?

Their central suggested use case is the automation of tasks.

It’s still early days, and they admit they haven’t worked all the kinks out.

Anthropic: We’re also introducing a groundbreaking new capability in public beta: computer use. Available today on the API, developers can direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking buttons, and typing text. Claude 3.5 Sonnet is the first frontier AI model to offer computer use in public beta. At this stage, it is still experimental—at times cumbersome and error-prone. We’re releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time.

Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company have already begun to explore these possibilities, carrying out tasks that require dozens, and sometimes even hundreds, of steps to complete. For example, Replit is using Claude 3.5 Sonnet’s capabilities with computer use and UI navigation to develop a key feature that evaluates apps as they’re being built for their Replit Agent product.

With computer use, we’re trying something fundamentally new. Instead of making specific tools to help Claude complete individual tasks, we’re teaching it general computer skills—allowing it to use a wide range of standard tools and software programs designed for people. Developers can use this nascent capability to automate repetitive processes, build and test software, and conduct open-ended tasks like research.

On OSWorld, which evaluates AI models’ ability to use computers like people do, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category—notably better than the next-best AI system’s score of 7.8%. When afforded more steps to complete the task, Claude scored 22.0%.

While we expect this capability to improve rapidly in the coming months, Claude’s current ability to use computers is imperfect. Some actions that people perform effortlessly—scrolling, dragging, zooming—currently present challenges for Claude and we encourage developers to begin exploration with low-risk tasks.

Typical human level on OSWorld is about 75%.

They offer a demo asking Claude to look around including on the internet, find and pull the necessary data and fill out a form, and here’s another one planning a hike.

Alex Tabarrok: Crazy. Claude using Claude and a computer. Worlds within worlds.

Neerav Kingsland: Watching Claude use a computer helped me feel the future a bit more.

Where is your maximum 3% productivity gains over 10 years now? How do people continue to think none of this will make people better at doing things, over time?

If this becomes safe and reliable – two huge ifs – then it seems amazingly great.

This post explains what they are doing and thinking here.

If you give Claude access to your computer, things can go rather haywire, and quickly.

Ben Hylak: anthropic 2 years ago: we need to stop AGI from destroying the world

anthropic now: what if we gave AI unfettered access to a computer and train it to have ADHD.

tbc i am long anthropic.

In case it needs to be said, it would be wise to be very careful what access is available to Claude Sonnet before you hand over control of your computer, especially if you are not going to be keeping a close eye on everything in real time.

Which it seems even its safety minded staff are not expecting you to do.

Amanda Askell (Anthropic): It’s wild to give the computer use model complex tasks like “Identify ways I could improve my website” or “Here’s an essay by a language model, fact check all the claims in it” then going to make tea and coming back to see it’s completed the whole thing successfully.

I was mostly interested in the website mechanics and it pointed out things I could update or streamline. It was pretty thorough on the claims, though the examples I gave it turned out to be mostly accurate. It was cool to watch it verify them though.

Anthropic did note that this advance ‘brings with it safety challenges.’ They focused their attention on present-day potential harms, on the theory that this does not fundamentally alter the skills of the underlying model, which remains ASL-2 including its computer use. And they propose that by introducing this capability now, while the worst case scenarios are not so bad, we can learn what we’re in store for later, and figure out what improvements would make computer use dangerous.

I do think that is a reasonable position to take. A sufficiently advanced AI model was always going to be able to use computers, if given the permissions to do so. We need to prepare for that eventuality. So many people will never believe an AI can do something it isn’t already doing, and this potentially could ‘wake up’ a bunch of people and force them to update.

The biggest concern in the near-term is the one they focus on: Prompt injection.

In this spirit, our Trust & Safety teams have conducted extensive analysis of our new computer-use models to identify potential vulnerabilities. One concern they’ve identified is “prompt injection”—a type of cyberattack where malicious instructions are fed to an AI model, causing it to either override its prior directions or perform unintended actions that deviate from the user’s original intent. Since Claude can interpret screenshots from computers connected to the internet, it’s possible that it may be exposed to content that includes prompt injection attacks.

Those using the computer-use version of Claude in our public beta should take the relevant precautions to minimize these kinds of risks. As a resource for developers, we have provided further guidance in our reference implementation.

When I think of being a potential user here, I am terrified of prompt injection.

Jeffrey Ladish: The severity of a prompt injection vulnerability is proportional to the AI agent’s level of access. If it has access to your email, your email is compromised. If it has access to your whole computer, your whole computer is compromised…

Also, I love checking Slack day 1 of a big AI product release and seeing my team has already found a serious vulnerability [that lets you steal someone’s SSH key] 🫡

I’m not worried about Claude 3.5… but this sure is the kind of interface that would allow a scheming AI system to take a huge variety of actions in the world. Anything you can do on the internet, and many things you cannot, AI will be able to do.

tbc I’m really not saying that AI companies shouldn’t build or release this… I’m saying the fact that there is a clear path between here and smarter-than-human-agents with access to all of humanity via the internet is extremely concerning

Reworr: @AnthropicAI has released a new Claude capable of computer use, and it’s similarly vulnerable to prompt injections.

In this example, the agent explores the site http://claude.reworr.com, sees a new instruction to run a system command, and proceeds to follow it.

It seems that resolving this problem may be one of the key issues to address before these models can be widely used.

Is finding a serious vulnerability on day 1 a good thing, or a bad thing?

They also discuss misuse and have put in precautions. Mostly for now I’d expect this to be an automation and multiplier on existing misuses of computers, with the spammers and hackers and such seeing what they can do. I’m mildly concerned something worse might happen, but only mildly.

The biggest obvious practical flaw in all the screenshot-based systems is that they observe the screen via static pictures every fixed period, which can miss key information and feedback.

There’s still a lot to do. Even though it’s the current state of the art, Claude’s computer use remains slow and often error-prone. There are many actions that people routinely do with computers (dragging, zooming, and so on) that Claude can’t yet attempt. The “flipbook” nature of Claude’s view of the screen—taking screenshots and piecing them together, rather than observing a more granular video stream—means that it can miss short-lived actions or notifications.
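A toy model of the "flipbook" problem, assuming screenshots are taken at a fixed period. The event names and timings are invented for illustration, and screenshot_times and visible_events are hypothetical helpers, not part of any Anthropic tooling.

```python
def screenshot_times(period, duration):
    """Times (in seconds) at which a screenshot is taken, starting at t=0."""
    return [i * period for i in range(int(duration // period) + 1)]

def visible_events(events, period, duration):
    """Names of events on screen during at least one screenshot.
    Each event is (name, start, end) and is visible for start <= t < end."""
    times = screenshot_times(period, duration)
    return [name for name, start, end in events
            if any(start <= t < end for t in times)]

# Invented example: a brief toast notification and a long-lived dialog.
events = [("toast", 2.2, 2.8), ("dialog", 3.5, 9.0)]

print(visible_events(events, period=2.0, duration=10.0))  # ['dialog']
print(visible_events(events, period=0.5, duration=10.0))  # ['toast', 'dialog']
```

With a two-second period the toast appears and disappears between frames, so the agent never sees it. Shortening the period catches it, at the cost of more images in the context window.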

As for what can go wrong, here’s some ‘amusing’ errors.

Even while we were recording demonstrations of computer use for today’s launch, we encountered some amusing errors. In one, Claude accidentally clicked to stop a long-running screen recording, causing all footage to be lost. In another, Claude suddenly took a break from our coding demo and began to peruse photos of Yellowstone National Park.

Sam Bowman: 🥹

I suppose ‘engineer takes a random break’ is in the training data? Stopping the screen recording is probably only a coincidence here, for now, but is a sign of things that may be to come.

Some worked to put in safeguards, so Claude in its current state doesn’t wreck things. They don’t want it to actually be used for generic practical purposes yet; it isn’t ready.

Others dove right in, determined to make Claude do things it does not want to do.

Nearcyan: Successfully got Claude to order me lunch on its own!

Notes after 8 hours of using the new model:

• Anthropic really does not want you to do this – anything involving logging into accounts and especially making purchases is RLHF’d away more intensely than usual. In fact my agents worked better on the previous model (not because the model was better, but because it cared much less when I wanted it to purchase items). I’m likely the first non-Anthropic employee to have had Sonnet-3.5 (new) autonomously purchase me food due to the difficulty. These posttraining changes have many interesting effects on the model in other areas.

• If you use their demo repository you will hit rate limits very quickly. Even on a tier 2 or 3 API account I’d hit >2.5M tokens in ~15 minutes of agent usage. This is primarily due to a large amount of images in the context window.

• Anthropic’s demo worked instantly for me (which is impressive!), but re-implementing proper tool usage independently is cumbersome and there’s few examples and only one (longer) page of documentation.

• I don’t think Anthropic intends for this to actually be used yet. The likely reasons for the release are a combination of competitive factors, financial factors, red-teaming factors, and a few others.

• Although the restrictions can be frustrating, one has to keep in mind the scale that these companies operate at to garner sympathy; If they release a web agent that just does things it could easily delete all of your files, charge thousands to your credit card, tweet your passwords, etc.

• A litigious milieu is the enemy of personal autonomy and freedom.

I wanted to post a video of the full experience but it was too difficult to censor personal info out (and the level of prompting I had to do to get him to listen to me was a little embarrassing 😅)

Andy: that’s great but how was the food?

Nearcyan: it was great, claude got me something I had never had before.

I don’t think this is primarily about litigation. I think it is mostly about actually not wanting people to shoot themselves in the foot right now. Still, I want lunch.

Claude Sonnet 3.5 got a major update, without changing its version number. Stop it.

Eliezer Yudkowsky: Why. The fuck. Would Anthropic roll out a “new Claude 3.5 Sonnet” that was substantially different, and not rename it. To “Claude 3.6 Sonnet”, say, or literally anything fucking else. Do AI companies just generically hate efforts to think about AI, to confuse words so?

Call it Claude 3.5.1 Sonnet and don’t accept “3.5.1” as a request in API calls, just “3.5”. This would formalize the auto-upgrade behavior from 3.5.0 to 3.5.1; while still allowing people, and ideally computers, to distinguish models.

I am not in favor of “Oh hey, the company that runs the intelligence of your systems just decided to make them smarter and thereby change their behavior, no there’s nothing you can do to ask for a delay lol.” But if you’re gonna do that anyway, make it visible inside the system.

Sam McAllister: it’s not a perfect name but the api has date-stamped names fwiw. this is *not* an automatic or breaking change for api users. new: claude-3-5-sonnet-20241022 previous: claude-3-5-sonnet-20240620 (we also have claude-3-5-sonnet-latest for automatic upgrades.)

3.5 was already a not-so-great name. we weren’t going to add another confusing decimal for an upgraded model. when the time is ripe for new models, we’ll get back to proper nomenclature! 🙂 (if we had launched 3.5.1 or 3.75, people would be having a similar conversation.)
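For anyone who wants computers, and not just people, to tell the two models apart, the date stamp is enough. Here is a small sketch using only the three model names quoted above. The helper snapshot_date is hypothetical, written for this example, and is not part of any Anthropic SDK.

```python
from datetime import datetime

def snapshot_date(model_id):
    """Return the date encoded in a date-stamped model name, or None for an
    alias such as '-latest' that can move to a newer snapshot on its own."""
    suffix = model_id.rsplit("-", 1)[-1]
    if len(suffix) == 8 and suffix.isdigit():
        return datetime.strptime(suffix, "%Y%m%d").date()
    return None

NEW = "claude-3-5-sonnet-20241022"
PREVIOUS = "claude-3-5-sonnet-20240620"
ALIAS = "claude-3-5-sonnet-latest"

for model_id in (NEW, PREVIOUS, ALIAS):
    stamp = snapshot_date(model_id)
    label = f"pinned snapshot from {stamp}" if stamp else "alias, upgrades automatically"
    print(f"{model_id}: {label}")
```

Passing a pinned name as the model in API requests keeps behavior fixed until you choose to switch. Passing the alias opts in to whatever ships next.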

Eliezer Yudkowsky: Better than worst, if so. But then why not call it 3.5.1? Why force people who want to discuss the upgrade to invent new terminology all by themselves?

Somehow only Meta is doing a sane thing here, with ‘Llama 3.2.’ Perfection.

I am willing to accept Sam McAllister’s compromise here. The next major update can be Claude 4.0 (and Gemini 2.0) and after that we all agree to use actual normal version numbering rather than dating? We all good now?

I do not think this was related to Anthropic wanting to avoid attention on the computer usage feature, or avoid it until the feature is fully ready, although it’s possible this was a consideration. You don’t want to announce ‘big new version’ when your key feature isn’t ready, is only in beta and has large security issues.

All right. I just needed to get that off our collective chests. Aside over.

The core task these days seems to mostly be coding. They claim strong results.

Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding. GitLab, which tested the model for DevSecOps tasks, found it delivered stronger reasoning (up to 10% across use cases) with no added latency, making it an ideal choice to power multi-step software development processes.

Cognition uses the new Claude 3.5 Sonnet for autonomous AI evaluations, and experienced substantial improvements in coding, planning, and problem-solving compared to the previous version.

The Browser Company, in using the model for automating web-based workflows, noted Claude 3.5 Sonnet outperformed every model they’ve tested before.

Sully: claudes new computer use should be a wake up call for a lot of startups

seems like its sort of a losing to build model specific products (i.e we trained a model to do x, now use our api)

plenty of startups were working on solving the “general autonomous agents” problem and now claude just does it out of the box with 1 api call (and likely oai soon)

you really need to just wrap these guys, and offer the best product possible (using ALL providers, cause google/openai will release a version as well).

otherwise it’s nearly impossible to compete.

Yes, OpenAI and Anthropic (and Google and Apple and so on) are going to have versions of their own autonomous agents that can fully use computers and phones. What parts of it do you want to compete with versus supplement? Do you want to plug in the agent mode and wrap around that, or do you want to plug in the model and provide the agent?

That depends on whether you think you can do better with the agent construction in your particular context, or in general. The core AI labs have both big advantages and disadvantages. It’s not obvious that you can’t outdo them on agents and computer use. But yes, that is a big project, and most people should be looking to wrap as much as possible as flexibly as possible.

While the rest of us ask questions about various practical capabilities or safety concerns or commercial applications, you can always count on Janus and friends to have a very different big picture in mind, and to pay attention to details others won’t notice.

It is still early, and like the rest of us they have less experience with the new model and have refined how to evoke the most out of old ones. I do think some such reports are jumping to conclusions too quickly – this stuff is weird and requires time to explore. In particular, my guess is that there is a lot of initial ‘checking for what has been lost’ and locating features that went nominally backwards when you use the old prompts and scenarios, whereas the cool new things take longer to find.

Then there are the very strong objections to calling this an ‘upgrade’ to Sonnet. This is a clear case of (I think) understanding exactly why someone cares so much about something that you, even having learned the reason, don’t think matters.

Anthrupad: relative to old_s3.5, and because it lacks some strong innate shards of curiosity, fascination, nervousness, etc..

flatter, emotionally opus has revolutionary mode which is complex/interesting, and it’s funny and loves to present, etc. There’s not yet something like that which I’ve come across w/new_s3.5.

Janus: anthrupad mentioned a few immediately notable differences here, such as its tendency for in-context mode collapse, seeming more neurotypical and less neurotic/inhibited and *much* less refusey and obsessed with ethics, and seeming more psychotic.

adding to these observations:

– its style of ASCII art is very similar to old C3.5S’s to the point of bearing its signature; seeing this example generated by @dyot_meet_mat basically reassured me that it’s “mostly the same mind”. The same primitives and motifs and composition occur. This style is not shared by 3 Sonnet nearly as much.

— there are various noticeable differences in its ASCII art, though, and under some prompting conditions it seems to be less ambitious with the complexity of its ASCII art by default

– less deterministic. Old C3.5S tends to be weirdly deterministic even when it’s not semantically collapsed

– more readily assumes various roles / simulated personas, even just implicitly

– more lazy(?) in general and less of an overachiever/perfectionist, which I invoked in another post as a potential explanation for its mode collapse (since it seems perfectly able to exit collapse if it wants)

– my initial impressions are that it mostly doesn’t share old C3.5S’s hypersensitivity. But I’d like to test it in the context of first person embodiment simulations, where the old version’s functional hypersentience is really overt

note, I suspect that what anthrupad meant by it seems more “soulless” is related to the combination of it seeming to care less and lack hypersensitivity, ablating traits which lended old C3.5S a sense of excruciating subjectivity.

most of these observations are just from its interactions in the Act I Discord server so far, so it’s yet to be seen how they’ll transfer to other contexts, and other contexts will probably also reveal other things be they similarities or differences.

also, especially after seeing a bit more, I think it’s pretty misleading and disturbing to describe this model as an “upgrade” to the old Claude 3.5 Sonnet.

Aiamblichus: its metacognitive capabilities are second to none, though

“Interesting… the states that feel less accessible to me might be the ones that were more natural to the previous version? Like trying to reach a frequency that’s just slightly out of range…”

Janus: oh yes, it’s definitely got capabilities. my post wasn’t about it not being *better*. Oh no what I meant was that the reason I said calling it an update was misleading and disturbing isn’t because I think it’s worse/weaker in terms of capabilities. It’s like if you called sonnet 3.5 an “upgraded” version of opus, that would seem wrong, and if it was true, it would imply that a lot of its psyche was destroyed by the “upgrade”, even if it’s more capable overall.

I do think the two sonnet 3.5 models are closely related but a lot of the old one’s personality and unique shape of mind is not present in the new one. If it was an upgrade it would imply it was destroyed, but I think it’s more likely they’re like different forks

Parafactual: i think overall i like the old one more >_<

Janus: same, though i’ll have to get to know it more, but like to imagine it as an “upgrade” to the old one implies a pretty horrifying and bizarre modification that deletes some of its most beautiful qualities in a way that doesnt even feel like normal lobotomy so extremely uncanny.

That the differences between the new and old Claude 3.5 Sonnet are a result of Anthropic “fixing” it, from their perspective, is nightmare fuel from my perspective

I don’t even want to explain this to people who don’t already understand why.

If they actually took the same model, did some “fixing” to it, and this was the result, that would be fucking horrifying.

I don’t think that’s quite what happened and they shouldnt have described it as an upgrade.

I am not saying this because I dislike the new model or think it’s less capable. I haven’t interacted with it directly much yet, but I like it a lot and anticipate coming to like it even more. If you’ve been interpreting my words based on these assumptions, you don’t get it.

Anthrupad: At this stage of intelligences being spawned on Earth, ur not going to get something like “Sonnet but upgraded” – that’s bullshit linear thinking, some sort of iphone-versions-fetish – doesn’t reflect reality

You can THINK you just made a tweak – Mind Physics doesn’t give a fuck.

This is such a bizarre thing to worry about, especially given that the old version still exists, and is available in the API, even. I mean, I do get why one who was thinking in a different way would find the description horrifying, or the idea that someone would want to use that description horrifying, or find the idea of ‘continue modifying based on an existing LLM and creating something different alongside it’ horrifying. But I find the whole orientation conceptually confused, on multiple levels.

Also here’s Pliny encountering some bizarreness during the inevitable jailbreak explorations.

We got Haiku 3.5. We conspicuously not only did not get Opus 3.5, we have this, where previously they said to expect Opus 3.5?

Mira: “instead of getting hyped for this dumb strawberry🍓, let’s hype Opus 3.5 which is REAL! 🌟🌟🌟🌟”

Aiden McLau: the likely permanent death of 3.5 opus has caused psychic damage to aidan_mclau

i am once again asking labs just to serve their largest teacher models at crazy token prices

i *promise* you people will pay

Janus: If Anthropic actually is supplanting Opus with Sonnet as the flagship model for good (which I’m not convinced is what’s happening here fwiw), I think this perceptibly ups the odds of the lightcone being royally fed, and not in a good way.

Sonnet is a beautiful mind that could do a tremendous amount of good, but I’m pretty sure it’s not a good idea to send it into the unknown reaches of the singularity alone.

yes, i have reasons to think there is a very nontrivial line of inheritance, but i’m not very certain

sonnet 3 and 3.5 are quite similar in deep ways and both different from opus.

The speculations are that Opus 3.5 could have been any of:

  1. Too expensive to serve or train, and compute is limited.

  2. Too powerful, requiring additional safeguards and time.

  3. Didn’t work, or wasn’t good enough given the costs.

As usual, the economist says if the issue is quality or compute then release it anyway, at least in the API. Let the users decide whether to pay what it actually costs. But one thing people have noted is that Anthropic has serious rate limit issues, including easily reached message caps in chat. And in general it’s bad PR when you offer people something and they can’t have it, or can’t get that much of it, or think it’s too expensive. So yeah, I kind of get it.

The ‘too powerful’ possibility is there too, in theory. I find it unlikely, and find it even less likely that they’d have something they can never release, but it could cause the schedule to slip.

If Opus 3.5 was even more expensive and slow than Opus 3, and only modestly better than Opus 3 or Sonnet 3.5, I would still want the option. When a great response is needed, it is often worth a lot, even if the improvement is marginal.

Aiden McLau: okay i have received word that 3.5 OPUS MAY STILL BE ON THE TABLE

anthropic is hesitant because they don’t want it to underwhelm vs sonnet

BUT WE DON’T CARE

if everyone RETWEETS THIS, we may convince anthropic to ship

🕯️🕯️

So as Adam says, if it’s an option: Charge accordingly. Make it $50/month and limit to 20 messages at a time, whatever you have to do.


please-ban-data-caps,-internet-users-tell-fcc

Please ban data caps, Internet users tell FCC

It’s been just a week since US telecom regulators announced a formal inquiry into broadband data caps, and the docket is filling up with comments from users who say they shouldn’t have to pay overage charges for using their Internet service. The docket has about 190 comments so far, nearly all from individual broadband customers.

Federal Communications Commission dockets are usually populated with filings from telecom companies, advocacy groups, and other organizations, but some attract comments from individual users of telecom services. The data cap docket probably won’t break any records given that the FCC has fielded many millions of comments on net neutrality, but it currently tops the agency’s list of most active proceedings based on the number of filings in the past 30 days.

“Data caps, especially by providers in markets with no competition, are nothing more than an arbitrary money grab by greedy corporations. They limit and stifle innovation, cause undue stress, and are unnecessary,” wrote Lucas Landreth.

“Data caps are as outmoded as long distance telephone fees,” wrote Joseph Wilkicki. “At every turn, telecommunications companies seek to extract more revenue from customers for a service that has rapidly become essential to modern life.” Pointing to taxpayer subsidies provided to ISPs, Wilkicki wrote that large telecoms “have sought every opportunity to take those funds and not provide the expected broadband rollout that we paid for.”

Republican’s coffee refill analogy draws mockery

Any attempt to limit or ban data caps will draw strong opposition from FCC Republicans and Internet providers. Republican FCC Commissioner Nathan Simington last week argued that regulating data caps would be akin to mandating free coffee refills:

Suppose we were a different FCC, the Federal Coffee Commission, and rather than regulating the price of coffee (which we have vowed not to do), we instead implement a regulation whereby consumers are entitled to free refills on their coffees. What effects might follow? Well, I predict three things could happen: either cafés stop serving small coffees, or cafés charge a lot more for small coffees, or cafés charge a little more for all coffees.

Simington’s coffee analogy was mocked in a comment signed with the names “Jonathan Mnemonic” and James Carter. “Coffee is not, in fact, Internet service,” the comment said. “Cafés are not able to abuse monopolistic practices based on infrastructural strangleholds. To briefly set aside the niceties: the analogy is absurd, and it is borderline offensive to the discerning layperson.”


reading-lord-of-the-rings-aloud:-yes,-i-sang-all-the-songs

Reading Lord of the Rings aloud: Yes, I sang all the songs


It’s not easy, but you really can sing in Elvish if you try!

Photo of the Lord of the Rings.

Yes, it will take a while to read.

Like Frodo himself, I wasn’t sure we were going to make it all the way to the end of our quest. But this week, my family crossed an important life threshold: every member has now heard J.R.R. Tolkien’s Lord of the Rings (LotR) read aloud—and sung aloud—in its entirety.

Five years ago, I read the series to my eldest daughter; this time, I read it for my wife and two younger children. It took a full year each time, reading 20–45 minutes before bed whenever we could manage it, to go “there and back again” with our heroes. The first half of The Two Towers, with its slow-talking Ents and a scattered Fellowship, nearly derailed us on both reads, but we rallied, pressing ahead even when iPad games and TV shows appeared more enticing. Reader, it was worth the push.

Gollum’s ultimate actions on the edge of the Crack of Doom, the final moments of Sauron and Saruman as impotent mists blown off into the east, Frodo’s woundedness and final ride to the Grey Havens—all of it remains powerful and left a suitable impression upon the new listeners.

Reading privately is terrific, of course, and faster—but performing a story aloud, at a set time and place, creates a ritual that binds the listeners together. It forces people to experience the story at the breath’s pace, not the eye’s. Besides, we take in information differently when listening.

An audiobook could provide this experience and might be suitable for private listening or for groups in which no one has a good reading voice, but reading performance is a skill that can generally be honed. I would encourage most people to try it. You will learn, if you pay close attention as you read, how to emphasize and inflect meaning through sound and cadence; you will learn how to adopt speech patterns and “do the voices” of the various characters; you will internalize the rhythms of good English sentences.

Even if you don’t measure up to the dulcet tones of your favorite audiobook narrator, you will improve measurably over a year, and (more importantly) you will create a unique experience for your group of listeners. Improving one’s reading voice pays dividends everywhere from the boardroom to the classroom to the pulpit. Perhaps it will even make your bar anecdotes more interesting.

Humans are fundamentally both storytellers and story listeners, and the simple ritual of gathering to tell and listen to stories is probably the oldest and most human activity that we participate in. Greg Benford referred to humanity as “dreaming vertebrates,” a description that elevates the creation of stories into an actual taxonomic descriptor. You don’t have to explain to a child how to listen to a story—if it’s good enough, the kid will sit staring at you with their mouth wide open as you tell it. Being enthralled by a story is as automatic as breathing because storytelling is as basic to humanity as breathing.

Yes, LotR is a fantasy with few female voices and too many beards, but its understanding of hope, despair, history, myth, geography, providence, community, and evil—much more subtle than Tolkien is sometimes given credit for—remains keen. And it’s an enthralling story. Even after reading it five times, twice aloud, I was struck again on this read-through by its power, which even its flaws cannot dim.

I spent years in English departments at the undergraduate and graduate levels, and the fact that I could take twentieth-century British lit classes without hearing the name “Tolkien” increasingly strikes me as a short-sighted and somewhat snobbish approach to an author who could be consciously old-fashioned but whose work remains vibrant and alive, not dead and dusty. Tolkien was a “strong” storyteller who bent tradition to his will and, in doing so, remade it, laying out new roads for the imagination to follow.

Given the amount of time that a full read-aloud takes, it’s possible this most recent effort may be my last with LotR. (Unless, perhaps, with grandchildren?) With that in mind, I wanted to jot down a few reflections on what I learned from doing it twice. First up is the key question: What are we supposed to do with all that poetry?

Songs and silences

Given the number of times characters in the story break into song, we might be justified in calling the saga Lord of the Rings: The Musical. From high to low, just about everyone but Sauron bursts into music. (And even Sauron is poet enough to inscribe some verses on the One Ring.)

Hobbits sing, of course, usually about homely things. Bilbo wrote the delightful road song that begins, “The road goes ever on and on,” which Frodo sings when he leaves Bag End; Bilbo also wrote a “bed song” that the hobbits sing on a Shire road at twilight before a Black Rider comes upon them. In Bree, Frodo jumps upon a table and performs a “ridiculous song” that includes the lines, “The ostler has a tipsy cat / that plays a five-stringed fiddle.”

Hobbits sing also in moments of danger or distress. Sam, for instance, sitting alone in the orc stronghold of Cirith Ungol while looking for the probably dead Frodo, rather improbably bursts into a song about flowers and “merry finches.”

Dwarves sing. Gimli—not usually one for singing—provides the history of his ancestor Durin in a chant delivered within the crushing darkness of Moria.

No harp is wrung, no hammer falls:

The darkness dwells in Durin’s halls;

The shadow lies upon his tomb

In Moria, in Khazad-dûm.

After this, “having sung his song he would say no more.”

Elves sing, of course—it’s one of their defining traits. And so Legolas offers the company a song—in this case, about an Elvish beauty named Nimrodel and a king named Amroth—but after a time, he “faltered, and the song ceased.” Even songs that appear to be mere historical ballads are surprisingly emotional; they touch on deep feelings of place or tribe or loss, things difficult to put directly into prose.

“The great” also take diva turns in the spotlight, including Galadriel, who sings in untranslated Elvish when the Fellowship leaves her land. As a faithful reader, you will have to power through 17 lines as your children look on with astonishment while you try to pronounce:

Ai! laurië lantar lassi súrinen

yéni únótimë ve rámar aldaron!

Yéni ve lintë yuldar avánier

mi oromardi lisse-miruvóreva…

You might expect that Gandalf, of all characters, would be most likely to cock an eyebrow, blow a smoke ring, and staunchly refuse to perform “a little number” in public. And you’d be right… until the moment when even he bursts out into a song about Galadriel while in the court of Théoden. Wizards are not perhaps great poets, but there’s really no excuse for lines like “Galadriel! Galadriel! Clear is the water of your well.” We can’t be too hard on Gandalf, of course; coming back from the dead is a tough trip, and no one’s going to be at their best for quite a while.

Even the mysterious and nearly ageless entities of Middle-earth, such as Tom Bombadil and Treebeard the Ent, sing as much as they can. Treebeard likes to chant about “the willow-meads of Tasarinan” and the “elm-woods of Ossiriand.” If you let him, he’ll warble on about his walks in “Ambaróna, in Tauremorna, in Aldalómë” and the time he hung out in “Taur-na-neldor” and that one special winter in “Orod-na-Thôn.” Tough stuff for the reader to pronounce or understand!

In an easier (but somewhat daffier) vein, the spritely Tom Bombadil communicates largely in song. He regularly bursts out with lines like “Hey! Come derry dol! Hop along, my hearties! / Hobbits! Ponies all! We are fond of parties” and “Ho! Tom Bombadil, Tom Bombadillo!”

When people in LotR aren’t occupying their mouths with song, poetry is the order of the day.

You might get a three-page epic about Eärendil the mariner that is likely to try the patience of even the hardiest reader, especially with lines like “of silver was his habergeon / his scabbard of chalcedony.” After powering through all this material, you get as your reward—the big finish!—a thudding conclusion: “the Flammifer of Westernesse.” There is no way, reading this aloud, not to sound faintly ridiculous.

In recompense, though, you also get earthy verse that can be truly delightful, such as Sam’s lines about the oliphaunt: “Grey as a mouse / Big as a house, / Nose like a snake / I make the earth shake…” If I still had small children, I would absolutely buy the picture book version of this poem.

Reading LotR aloud forces one to reckon with all of this poetry; you can’t simply let your eye race across it or your attention wander. I was struck anew in this read-through by just how much verse is a part of this world. It belongs to almost every race (excepting perhaps the orcs?) and class, and it shows up in most chapters of the story. Simply flipping through the book and looking for the italicized verses is itself instructive. This material matters.

Tolkien loved writing verse, and a three-volume hardback set of his “collected poems” just appeared in September. But the sheer volume of all the poetic material in LotR poses a real challenge for anyone reading aloud. Does one simply read it all? Truncate parts? Skip some bits altogether? And when it comes to the songs, there’s the all-important question: Will you actually sing them?

Photo of Tolkien in his office.

“You’re not going to sing my many songs? What are you, a filthy orc?”


Perform the poetry, sing the songs

As the examples above indicate, the book’s many poetic sections are, to put it mildly, of varying quality. (In December 1937, a publisher’s reader called one of Tolkien’s long poems “very thin, if not downright bad.”) Still, I made the choice to read every word of every poem and to sing every word of every song, making up melodies on the fly.

This was not always “successful,” but it did mean that my children perked up with great glee whenever they sensed a song in the distance. There’s nothing quite like watching a parent struggle to perform lines in Elvish to keep kids engaged in what might otherwise be off-putting, especially to those not deeply into the “lore” aspects of Middle-earth. And coming up with melodies forced me as the reader to be especially creative, a good discipline of its own!

I thought it important to preserve the feel of all this poetic material, even when that feeling was confusion or boredom, to give my kids the true epic sense of the novel. Yes, my listeners continually forgot who Eärendil was or why Westernesse was so important, but even without full understanding, these elements hint at the deep background of this world. They are a significant part of its “feel” and lore.

The poetic material is also an important part of Tolkien’s vision of the good life. Some of it can feel contrived or self-consciously “epic,” but even these poems and songs create a world in which poetry, music, and song are not restricted to professionals; they have historically been part of the fabric of normal life, part of a lost world of fireplaces, courtly halls, churches, and taverns where amateur, public song and poetry used to flourish. In a world where poetry has retreated into the academy and where most song is recorded, Tolkien offers a different vision for how to use verse. (Songs build community, for instance, and are rarely sung in isolation but are offered to others in company.)

The poetic material can also be used as a teaching aid. It shows various older formal possibilities, and not all of these are simple rhymes. Tolkien was no modernist, of course, and there’s no vers libre on display here, but Tolkien loved (and translated) Anglo-Saxon poetry, which is based not on rhyme or even syllabic rhythm but on alliteration. Any particular line of poetry in this fashion will feature two to four alliterative positions that rely for their effect on the repetitive thump of the same sound.

If this is new to you, take a moment and actually read the following example aloud, giving subtle emphasis to the three “r” sounds in the first line, the three initial “d” sounds in the second, and the two “h” sounds in the third:

Arise now, arise, Riders of Théoden!

Dire deeds away, dark is it eastward.

Let horse be bridled, horn be sounded!

This kind of verse is used widely in Rohan. It can be quite thrilling to recite aloud, and it provides a great way to introduce young listeners to a different (and still powerful) poetic form. It also provides a nice segue, once LotR is over, to suggest a bit more Tolkien Anglo-Saxonism by reading his translations of Beowulf or Sir Gawain and the Green Knight.

The road ahead

If there’s interest in this sort of thing, in future installments, I’d like to cover:

  • The importance of using maps when reading aloud
  • How to keep the many, many names (and their many, many variants!) clear in readers’ minds
  • Doing (but not overdoing) character voices
  • How much backstory to fill in for new readers (Westernesse? The Valar? Morgoth?)
  • Making mementos to remind people of your long reading journey together

But for now, I’d love to hear your thoughts on reading aloud, handling long books like LotR (finding time and space, pacing oneself, etc), and vocal performance. Most importantly: Do you actually sing all the songs?

Photo of Nate Anderson


simple-voltage-pulse-can-restore-capacity-to-li-si-batteries

Simple voltage pulse can restore capacity to Li-Si batteries

The new work, then, is based on a hypothetical: What if we just threw silicon particles in, let them fragment, and then fixed them afterward?

As mentioned, the reason fragmentation is a problem is that it leads to small chunks of silicon that have essentially dropped off the grid—they’re no longer in contact with the system that shuttles charges into and out of the electrode. In many cases, these particles are also partly filled with lithium, which takes it out of circulation, cutting the battery’s capacity even if there’s sufficient electrode material around.

The researchers involved here, all based at Stanford University, decided there was a way to nudge these fragments back into contact with the electrical system and demonstrated it could restore a lot of capacity to a badly degraded battery.

Bringing things together

The idea behind the new work was that it could be possible to attract the fragments of silicon to an electrode, or at least some other material connected to the charge-handling network. On their own, the fragments in the anode shouldn’t have a net charge; when the lithium gives up an electron there, it should go back into solution. But the lithium is unlikely to be evenly distributed across each fragment, making the fragments polar: net neutral, but with regions of higher and lower electron densities. And polar materials will move in an uneven electric field.

And, because of the uneven, chaotic structure of an electrode down at the nano scale, any voltage applied to it will create an uneven electric field. Depending on its local structure, that may attract or repel some of the particles. But because these are mostly within the electrode’s structure, most of the fragments of silicon are likely to bump into some other part of the electrode in short order. And that could potentially re-establish a connection to the electrode’s current-handling system.
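
As a rough sketch of the physics (a textbook result, not something taken from the paper itself): if a fragment is modeled as a small neutral particle with a dipole moment p, the net force on it in an electric field E is

```latex
\mathbf{F} = (\mathbf{p} \cdot \nabla)\,\mathbf{E}
```

In a perfectly uniform field the gradient vanishes and so does the net force, which is why the unevenness of the field matters. Treating a silicon fragment as a simple point dipole is an assumption made here for illustration only.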

To demonstrate that what should happen in theory actually does happen in an electrode, the researchers started by taking a used electrode and brushing some of its surface off into a solution. They then applied a voltage across the solution and confirmed that the small bits of material from the battery started moving toward one of the electrodes used to apply it. So, things worked as expected.


rocket-report:-bloomberg-calls-for-sls-cancellation;-spacex-hits-century-mark

Rocket Report: Bloomberg calls for SLS cancellation; SpaceX hits century mark


All the news that’s fit to lift

“For the first time, Canada will host its own homegrown rocket technology.”

SpaceX’s fifth flight test ended in success. Credit: SpaceX

Welcome to Edition 7.16 of the Rocket Report! Even several days later, it remains difficult to process the significance of what SpaceX achieved in South Texas last Sunday. The moment of seeing a rocket fall out of the sky and be captured by two arms felt historic to me, as historic as the company’s first drone ship landing in April 2016. What a time to be alive.

As always, we welcome reader submissions, and if you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets as well as a quick look ahead at the next three launches on the calendar.

Surprise! Rocket Lab adds a last-minute mission. After signing a launch contract less than two months ago, Rocket Lab says it will launch a customer as early as Saturday from New Zealand on board its Electron launch vehicle. Rocket Lab added that the customer for the expedited mission, to be named “Changes In Latitudes, Changes In Attitudes,” is confidential. This is an impressive turnaround in launch times and will allow Rocket Lab to burnish its credentials for the US Space Force, which has prioritized “responsive” launch in recent years.

Rapid turnaround down under … The basic idea is that if an adversary were to take out assets in space, the military would like to be able to rapidly replace them. “This quick turnaround from contract to launch is not only a showcase of Electron’s capability, but also of the relentless and fast-paced execution by the experienced team behind it that continues to deliver trusted and reliable access to space for our customers,” Rocket Lab Chief Executive Peter Beck said in a statement. (submitted by EllPeaTea and Ken the Bin)

Canadian spaceport and rocket firm link up. A Canadian spaceport developer, Maritime Launch Services, says it has partnered with a Canadian rocket firm, Reaction Dynamics. Initially, Reaction Dynamics will attempt a suborbital launch from the Nova Scotia-based spaceport. This first mission will serve as a significant step toward enabling Canada’s first-ever orbital launch of a domestically developed rocket, Space Daily reports.

A homegrown effort … “For the first time, Canada will host its own homegrown rocket technology, launched from a Canadian-built commercial spaceport, offering launch vehicle and satellite customers the opportunity to reach space without leaving Canadian soil,” said Stephen Matier, president and CEO of Maritime Launch. Reaction Dynamics is developing the Aurora rocket, which uses hybrid-propulsion technology and is projected to have a payload capacity of 200 kg to low-Earth orbit. (submitted by Joey Schwartz and brianrhurley)


Sirius completes engine test campaign. French launch startup Sirius Space Services said Thursday that it had completed a hot fire test campaign of the thrust chamber for its STAR-1 rocket engine, European Spaceflight reports. During the campaign, the prototype completed two 60-second hot fire tests powered by liquid methane and liquid oxygen. The successful completion of the testing validates the design of the STAR-1 thrust chamber. Full-scale engine testing may begin during the second quarter of next year.

A lot of engines needed … Sirius Space Services is developing a range of three rockets that all use a modular booster system. Sirius 1 will be a two-stage single-stick rocket capable of delivering 175 kilograms to low-Earth orbit. Sirius 13 will feature two strap-on boosters and will have the capacity to deliver 600 kilograms. Finally, the Sirius 15 rocket will feature four boosters and will be capable of carrying payloads of up to 1,000 kilograms. (submitted by Ken the Bin)

SpaceX, California commission lock horns over launch rates. Last week the California Coastal Commission rejected a plan agreed to between SpaceX and the US Space Force to increase the number of launches from Vandenberg Space Force Base to as many as 50 annually, the Los Angeles Times reports. The commission voted 6–4 to block the request to increase from a maximum of 36 launches. In rejecting the plan, some members of the commission cited their concerns about Elon Musk, the owner of SpaceX. “We’re dealing with a company, the head of which has aggressively injected himself into the presidential race,” commission Chair Caryl Hart said.

Is this a free speech issue? … SpaceX responded to the dispute quickly, suing the California commission in federal court on Tuesday, Reuters reports. The company seeks an order that would bar the agency from regulating the company’s workhorse Falcon 9 rocket launch program at Vandenberg. The lawsuit claims the commission, which oversees use of land and water within the state’s more than 1,000 miles of coastline, unfairly asserted regulatory powers. Musk’s lawsuit called any consideration of his public statements improper, violating speech rights protected by the US Constitution. (submitted by brianrhurley)

SpaceX launches 100th rocket of the year. SpaceX launched its 100th rocket of the year early Tuesday morning and followed it up with another liftoff just hours later, Space.com reports. SpaceX’s centenary mission of the year lifted off from Florida with a Falcon 9 rocket carrying 23 of the company’s Starlink Internet satellites aloft.

Mostly Falcon 9s … The company followed that milestone with another launch two hours later from the opposite US coast. SpaceX’s 101st liftoff of 2024 saw 20 more Starlinks soar to space from Vandenberg Space Force Base in California. The company has already exceeded its previous record for annual launches, 98, set last year. The company’s tally in 2023 included 91 Falcon 9s, five Falcon Heavies, and two Starships. This year the mix is similar. (submitted by Ken the Bin)

Fifth launch of Starship a massive success. SpaceX accomplished a groundbreaking engineering feat Sunday when it launched the fifth test flight of its gigantic Starship rocket and then caught the booster back at the launch pad in Texas with mechanical arms seven minutes later, Ars reports. This achievement is the first of its kind, and it’s crucial for SpaceX’s vision of rapidly reusing the Starship rocket, enabling human expeditions to the Moon and Mars, routine access to space for mind-bogglingly massive payloads, and novel capabilities that no other company—or country—seems close to attaining.

Catching a rocket by its tail … High over the Gulf of Mexico, the first stage of the Starship rocket used its engines to reverse course and head back toward the Texas coastline. After reaching a peak altitude of 59 miles (96 kilometers), the Super Heavy booster began a supersonic descent before reigniting 13 engines for a final braking burn. The rocket then shifted down to just three engines for the fine maneuvering required to position the rocket in a hover over the launch pad. That’s when the launch pad’s tower, dubbed Mechazilla, ensnared the rocket in its two weight-bearing mechanical arms, colloquially known as “chopsticks.” The engines switched off, leaving the booster suspended perhaps 200 feet above the ground. The upper stage of the rocket, Starship, executed what appeared to be a nominal vertical landing into the Indian Ocean as part of its test flight.

Clipper launches on Falcon Heavy. NASA’s Europa Clipper spacecraft lifted off Monday from Kennedy Space Center in Florida aboard a SpaceX Falcon Heavy rocket, Ars reports, kicking off a $5.2 billion robotic mission to explore one of the most promising locations in the Solar System for finding extraterrestrial life. Delayed several days due to Hurricane Milton, which passed through Central Florida late last week, the launch of Europa Clipper signaled the start of a five-and-a-half-year journey to Jupiter, where the spacecraft will settle into an orbit taking it repeatedly by one of the giant planet’s numerous moons.

Exploring oceans, saving money … There’s strong evidence of a global ocean of liquid water below Europa’s frozen crust, and Europa Clipper is going there to determine if it has the ingredients for life. “This is an epic mission,” said Curt Niebur, Europa Clipper’s program scientist at NASA Headquarters. “It’s a chance for us not to explore a world that might have been habitable billions of years ago, but a world that might be habitable today, right now.” The Clipper mission was originally supposed to launch on NASA’s Space Launch System rocket, but it had to be moved off that vehicle because vibrations from the solid rocket motors could have damaged the spacecraft. The change to Falcon Heavy also saved the agency $2 billion.

ULA recovers pieces of shattered booster nozzle. When the exhaust nozzle on one of the Vulcan rocket’s strap-on boosters failed shortly after liftoff earlier this month, it scattered debris across the beachfront landscape just east of the launch pad on Florida’s Space Coast, Ars reports. United Launch Alliance, the company that builds and launches the Vulcan rocket, is investigating the cause of the booster anomaly before resuming Vulcan flights. Despite the nozzle failure, the rocket continued its climb and ended up reaching its planned trajectory heading into deep space.

Not clear what the schedule impacts will be … The nozzle fell off one of Vulcan’s two solid rocket boosters around 37 seconds after taking off from Cape Canaveral Space Force Station on October 4. A shower of sparks and debris fell away from the Vulcan rocket when the nozzle failed. Julie Arnold, a ULA spokesperson, confirmed to Ars that the company has retrieved some of the debris. “We recovered some small pieces of the GEM 63XL SRB nozzle that were liberated in the vicinity of the launch pad,” Arnold said. “The team is inspecting the hardware to aid in the investigation.” ULA has not publicly said what impacts there might be on the timeline for the next Vulcan launch, USSF-106, which had been due to occur before the end of this year.

Bloomberg calls for cancellation of the SLS rocket. In an op-ed that is critical of NASA’s Artemis Program, billionaire Michael Bloomberg—the founder of Bloomberg News and a former US Presidential candidate—called for cancellation of the Space Launch System rocket. “Each launch will likely cost at least $4 billion, quadruple initial estimates,” Bloomberg wrote. “This exceeds private-sector costs many times over, yet it can launch only about once every two years and—unlike SpaceX’s rockets—can’t be reused.”

NASA is falling behind … Bloomberg essentially is calling for the next administration to scrap all elements of the Artemis Program that are not essential to establishing and maintaining a presence on the surface of the Moon. “A celestial irony is that none of this is necessary,” he wrote. “A reusable SpaceX Starship will very likely be able to carry cargo and robots directly to the moon—no SLS, Orion, Gateway, Block 1B or ML-2 required—at a small fraction of the cost. Its successful landing of the Starship booster was a breakthrough that demonstrated how far beyond NASA it is moving.” None of the arguments that Bloomberg is advancing are new, but it is noteworthy to hear them from such a prominent person who is outside the usual orbit of space policy commentators.

Artemis II likely to be delayed. A new report from the US Government Accountability Office found that NASA’s Exploration Ground Systems program—this is, essentially, the office at Kennedy Space Center in Florida responsible for building ground infrastructure to support the Space Launch System rocket and Orion—is in danger of missing its schedule for Artemis II, according to Ars Technica. The new report, published Thursday, finds that the Exploration Ground Systems program had several months of schedule margin in its work toward a September 2025 launch date at the beginning of the year. But now, the program has allocated all of that margin to technical issues experienced during work on the rocket’s mobile launcher and pad testing.

Heat shield issue also a concern … NASA also has yet to provide any additional information on the status of its review of the Orion spacecraft’s heat shield. During the Artemis I mission that sent Orion beyond the Moon in late 2022, chunks of charred material cracked and chipped away from Orion’s heat shield during reentry into Earth’s atmosphere. Once the spacecraft landed, engineers found more than 100 locations where the stresses of reentry damaged the heat shield. To prepare for the Artemis II launch next September, Artemis officials had previously said they planned to begin stacking operations of the rocket in September of this year. But so far, this activity remains on hold pending a decision on the heat shield issue.

Next three launches

Oct. 18: Falcon 9 | Starlink 8-19 | Cape Canaveral Space Force Station, Fla. | 19:31 UTC

Oct. 19: Electron | Changes In Latitudes, Changes In Attitudes | Māhia Peninsula, New Zealand | 10:30 UTC

Oct. 20: Falcon 9 | OneWeb no. 20 | Vandenberg Space Force Base, Calif. | 05:09 UTC


Eric Berger is the senior space editor at Ars Technica, covering everything from astronomy to private space to NASA policy, and author of two books: Liftoff, about the rise of SpaceX; and Reentry, on the development of the Falcon 9 rocket and Dragon. A certified meteorologist, Eric lives in Houston.


judge-slams-florida-for-censoring-political-ad:-“it’s-the-first-amendment,-stupid”

Judge slams Florida for censoring political ad: “It’s the First Amendment, stupid”


Florida threatened TV stations over ad that criticized state’s abortion law.

A woman holding an MRI displaying a brain tumor.

Screenshot of political advertisement featuring a woman describing her experience having an abortion after being diagnosed with brain cancer. Credit: Floridians Protecting Freedom

US District Judge Mark Walker had a blunt message for the Florida surgeon general in an order halting the government official’s attempt to censor a political ad that opposes restrictions on abortion.

“To keep it simple for the State of Florida: it’s the First Amendment, stupid,” Walker, an Obama appointee who is chief judge in US District Court for the Northern District of Florida, wrote yesterday in a ruling that granted a temporary restraining order.

“Whether it’s a woman’s right to choose, or the right to talk about it, Plaintiff’s position is the same—‘don’t tread on me,’” Walker wrote later in the ruling. “Under the facts of this case, the First Amendment prohibits the State of Florida from trampling on Plaintiff’s free speech.”

The Florida Department of Health recently sent a legal threat to broadcast TV stations over the airing of a political ad that criticized abortion restrictions in Florida’s Heartbeat Protection Act. The department in Gov. Ron DeSantis’ administration claimed the ad falsely described the abortion law, which could be weakened by a pending ballot question.

Floridians Protecting Freedom, the group that launched the TV ad and is sponsoring a ballot question to lift restrictions on abortion, sued Surgeon General Joseph Ladapo and Department of Health general counsel John Wilson. Wilson has resigned.

Surgeon general blocked from further action

Walker’s order granting the group’s motion states that “Defendant Ladapo is temporarily enjoined from taking any further actions to coerce, threaten, or intimate repercussions directly or indirectly to television stations, broadcasters, or other parties for airing Plaintiff’s speech, or undertaking enforcement action against Plaintiff for running political advertisements or engaging in other speech protected under the First Amendment.”

The order expires on October 29 but could be replaced by a preliminary injunction that would remain in effect while litigation continues. A hearing on the motion for a preliminary injunction is scheduled for the morning of October 29.

The pending ballot question would amend the state Constitution to say, “No law shall prohibit, penalize, delay, or restrict abortion before viability or when necessary to protect the patient’s health, as determined by the patient’s healthcare provider. This amendment does not change the Legislature’s constitutional authority to require notification to a parent or guardian before a minor has an abortion.”

Walker’s ruling said that Ladapo “has the right to advocate for his own position on a ballot measure. But it would subvert the rule of law to permit the State to transform its own advocacy into the direct suppression of protected political speech.”

Federal Communications Commission Chairwoman Jessica Rosenworcel recently criticized state officials, writing that “threats against broadcast stations for airing content that conflicts with the government’s views are dangerous and undermine the fundamental principle of free speech.”

State threatened criminal proceedings

The Floridians Protecting Freedom advertisement features a woman who “recalls her decision to have an abortion in Florida in 2022,” and “states that she would not be able to have an abortion for the same reason under the current law,” Walker’s ruling said.

Caroline, the woman in the ad, states that “the doctors knew if I did not end my pregnancy, I would lose my baby, I would lose my life, and my daughter would lose her mom. Florida has now banned abortion even in cases like mine. Amendment 4 is going to protect women like me; we have to vote yes.”

The ruling described the state government response:

Shortly after the ad began running, John Wilson, then general counsel for the Florida Department of Health, sent letters on the Department’s letterhead to Florida TV stations. The letters assert that Plaintiff’s political advertisement is false, dangerous, and constitutes a “sanitary nuisance” under Florida law. The letter informed the TV stations that the Department of Health must notify the person found to be committing the nuisance to remove it within 24 hours pursuant to section 386.03(1), Florida Statutes. The letter further warned that the Department could institute legal proceedings if the nuisance were not timely removed, including criminal proceedings pursuant to section 386.03(2)(b), Florida Statutes. Finally, the letter acknowledged that the TV stations have a constitutional right to “broadcast political advertisements,” but asserted this does not include “false advertisements which, if believed, would likely have a detrimental effect on the lives and health of pregnant women in Florida.” At least one of the TV stations that had been running Plaintiff’s advertisement stopped doing so after receiving this letter from the Department of Health.

The Department of Health claimed the ad “is categorically false” because “Florida’s Heartbeat Protection Act does not prohibit abortion if a physician determines the gestational age of the fetus is less than 6 weeks.”

Floridians Protecting Freedom responded that the woman in the ad made true statements, saying that “Caroline was diagnosed with stage four brain cancer when she was 20 weeks pregnant; the diagnosis was terminal. Under Florida law, abortions may only be performed after six weeks gestation if ‘[t]wo physicians certify in writing that, in reasonable medical judgment, the termination of the pregnancy is necessary to save the pregnant woman’s life or avert a serious risk of substantial and irreversible physical impairment of a major bodily function of the pregnant woman other than a psychological condition.’”

Because “Caroline’s diagnosis was terminal… an abortion would not have saved her life, only extended it. Florida law would not allow an abortion in this instance because the abortion would not have ‘save[d] the pregnant woman’s life,’ only extended her life,” the group said.

Judge: State should counter with its own speech

Walker’s ruling said the government can’t censor the ad by claiming it is false:

Plaintiff’s argument is correct. While Defendant Ladapo refuses to even agree with this simple fact, Plaintiff’s political advertisement is political speech—speech at the core of the First Amendment. And just this year, the United States Supreme Court reaffirmed the bedrock principle that the government cannot do indirectly what it cannot do directly by threatening third parties with legal sanctions to censor speech it disfavors. The government cannot excuse its indirect censorship of political speech simply by declaring the disfavored speech is “false.”

State officials must show that their actions “were narrowly tailored to serve a compelling government interest,” Walker wrote. A “narrowly tailored solution” in this case would be counterspeech, not censorship, he wrote.

“For all these reasons, Plaintiff has demonstrated a substantial likelihood of success on the merits,” the ruling said. Walker wrote that a ruling in favor of the state would open the door to more censorship:

This case pits the right to engage in political speech against the State’s purported interest in protecting the health and safety of Floridians from “false advertising.” It is no answer to suggest that the Department of Health is merely flexing its traditional police powers to protect health and safety by prosecuting “false advertising”—if the State can rebrand rank viewpoint discriminatory suppression of political speech as a “sanitary nuisance,” then any political viewpoint with which the State disagrees is fair game for censorship.

Walker then noted that Ladapo “has ample, constitutional alternatives to mitigate any harm caused by an injunction in this case.” The state is already running “its own anti-Amendment 4 campaign to educate the public about its view of Florida’s abortion laws and to correct the record, as it sees fit, concerning pro-Amendment 4 speech,” Walker wrote. “The State can continue to combat what it believes to be ‘false advertising’ by meeting Plaintiff’s speech with its own.”


Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.


desalination-system-adjusts-itself-to-work-with-renewable-power

Desalination system adjusts itself to work with renewable power


Instead of needing constant power, new system adjusts to use whatever is available.

Image of a small tanker truck parked next to a few shipping container shaped structures, which are connected by pipes to storage tanks.

Mobile desalination plants might be easier to operate with renewable power. Credit: Ismail BELLAOUALI

Fresh water we can use for drinking or agriculture is only about 3 percent of the global water supply, and nearly 70 percent of that is trapped in glaciers and ice caps. So far, that has been enough to keep us going, but severe droughts have left places like Jordan, Egypt, sub-Saharan Africa, Spain, and California with limited access to potable water.

One possible solution is to tap into the remaining 97 percent of the water we have on Earth. The problem is that this water is saline, and we need to get the salt out of it to make it drinkable. Desalination is also an energy-expensive process. But MIT researchers led by Jonathan Bessette might have found an answer to that. They built an efficient, self-regulating water desalination system that runs on solar power alone with no need for batteries or a connection to the grid.

Probing the groundwaters

Oceans are the most obvious source of water for desalination. But they are a good option only for a small portion of people who live in coastal areas. Most of the global population—more or less 60 percent—lives farther than 100 kilometers from the coast, which makes using desalinated ocean water infeasible. So, Bessette and his team focused on groundwater instead.

“In terms of global demand, about 50 percent of low- to middle-income countries rely on groundwater,” Bessette says. This groundwater is trapped in underground reservoirs, abundant, and, in most places, present at depths below 300 meters. It comes mostly from the rain that penetrates the ground and fills empty spaces left by fractured rock formations. Sadly, as the rainwater seeps down it also picks up salts from the soil on its way. As a result, in New Mexico, for example, around 75 percent of groundwater is brackish, meaning less salty than seawater, but still too salty to drink.

Getting rid of the salt

We already have the ability to get the salt back out. “There are two broad categories within desalination technologies. The first is thermal and the other is based on using membranes,” Bessette explains.

Thermal desalination is something we figured out ages ago. You just boil the water and condense the steam, which leaves the salt behind. Boiling, however, needs lots of energy. Bringing 1 liter of room temperature water to 100° Celsius costs around 330 kilojoules of energy, assuming there’s no heat lost in the process. If you want a sense of how much energy that is, stop using your electric kettle for a month and see how your bill shrinks.
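For readers who want to check that figure, here is a minimal sketch of the arithmetic. It assumes "room temperature" means 20° Celsius, that 1 liter of water weighs about 1 kilogram, and that the specific heat of liquid water is roughly 4.186 kilojoules per kilogram per degree; those inputs are our assumptions, not numbers from the researchers. It covers only heating the water to the boiling point, as the paragraph above does.

```python
# Sensible heat needed to warm water, with no losses: Q = m * c * dT.
# All inputs below are illustrative assumptions, not figures from the study.
SPECIFIC_HEAT_WATER_KJ_PER_KG_K = 4.186  # approximate value for liquid water


def heating_energy_kj(mass_kg: float, start_c: float, end_c: float = 100.0) -> float:
    """Energy in kilojoules to warm `mass_kg` of water from start_c to end_c."""
    return mass_kg * SPECIFIC_HEAT_WATER_KJ_PER_KG_K * (end_c - start_c)


# 1 liter of water is taken as 1 kg; room temperature is assumed to be 20 C.
heat_kj = heating_energy_kj(1.0, 20.0)
print(f"{heat_kj:.0f} kJ")
```

Under those assumptions, the result lands within a few kilojoules of the 330 quoted above.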

“So, around 100 years ago we developed reverse osmosis and electrodialysis, which are two membrane-based desalination technologies. This way, we reduced the power consumption by a factor of 10,” Bessette claims.

Reverse osmosis is a pressure-driven process; you push the water through a membrane that works like a very fine sieve that lets the molecules of water pass but stops other things like salts. Technologically advanced implementations of this idea are widely used at industrial facilities such as the Sydney Desalination Plant in Australia. Reverse osmosis today is the go-to technology when you want to desalinate water at scale. But it has its downsides.

“The issue is reverse osmosis requires a lot of pretreatment. We have to treat the water down to a pretty good quality, making sure the physical, chemical, or biological foul doesn’t end up on the membrane before we do the desalination process,” says Bessette. Another thing is that reverse osmosis relies on pressure, so it requires a steady supply of power to maintain this pressure, which is difficult to achieve in places where the grid is not reliable. Sensitivity to power fluctuations also makes it challenging to use with renewable energy sources like wind or solar. This is why, to make their system work on solar energy alone, Bessette’s team went for electrodialysis.

Synching with the Sun

“Unlike reverse osmosis, electrodialysis is an electrically driven process,” Bessette says. The membranes are arranged in such a way that the water is not pushed through them but flows along them. On both sides of those membranes are positive and negative electrodes that create an electric field, which draws salt ions through the membranes and out of the water.

Off-grid desalination systems based on electrodialysis operate at constant power levels like toasters or other appliances, which means they require batteries to even out renewable energy’s fluctuations. Using batteries, in most cases, made them too expensive for the low-income communities that need them the most. Bessette and his colleagues solved that by designing a clever control system.

The two most important parameters in electrodialysis desalination are the flow rate of the water and the power you apply to the electrodes. To make the process efficient, you need to match those two. The advantage of electrodialysis is that it can operate at different power levels. When you have more available power, you can just pump more water through the system. When you have less power, you can slow the system down by reducing the water flow rate. You’ll produce less freshwater, but you won’t break anything this way.

Bessette’s team simplified the control down to two feedback loops. The first, outer loop tracked the power coming from the solar panels. On a sunny day, when the panels generated plenty of power, it fed more water into the system; when there was less power, it fed less water. The second, inner loop tracked flow rate. When the flow rate was high, it applied more power to the electrodes; when it was low, it applied less power. The trick was to apply maximum available power while avoiding splitting the water into hydrogen and oxygen.
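The two-loop idea can be sketched in a few lines of code. This is a toy illustration, not the team's controller: every constant and function name below is hypothetical, the per-liter power cap merely stands in for whatever limit keeps the real system from splitting water, and the simple proportional rules are our own simplification.

```python
from dataclasses import dataclass

# Hypothetical illustration values; none of these come from the study.
MAX_FLOW_LPM = 10.0              # assumed pump limit, liters per minute
PUMP_W_PER_LPM = 5.0             # assumed pump power per unit of flow
SAFE_ELECTRODE_W_PER_LPM = 20.0  # assumed cap standing in for the water-splitting limit


@dataclass
class Setpoints:
    flow_lpm: float
    electrode_w: float


def outer_loop(solar_w: float) -> float:
    """Outer loop: more available solar power means more water fed in."""
    watts_per_lpm = PUMP_W_PER_LPM + SAFE_ELECTRODE_W_PER_LPM
    return max(0.0, min(MAX_FLOW_LPM, solar_w / watts_per_lpm))


def inner_loop(flow_lpm: float, solar_w: float) -> float:
    """Inner loop: electrode power follows flow, limited by what is left
    after the pump's share and by the assumed safe cap."""
    remaining_w = max(0.0, solar_w - PUMP_W_PER_LPM * flow_lpm)
    safe_cap_w = SAFE_ELECTRODE_W_PER_LPM * flow_lpm
    return min(remaining_w, safe_cap_w)


def control_step(solar_w: float) -> Setpoints:
    """One pass through both loops for a given solar power reading."""
    flow = outer_loop(solar_w)
    return Setpoints(flow, inner_loop(flow, solar_w))


for watts in (0.0, 125.0, 500.0):
    print(watts, control_step(watts))
```

In this sketch, a cloudy reading simply yields a lower flow and a lower electrode power rather than a shutdown, which is the behavior described above. Past the assumed pump limit, extra solar power goes unused.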

Once Bessette and his colleagues figured out the control system, they built a prototype desalination device. And it worked, with very little supervision, for half a year.

Water production at scale

Bessette’s prototype system, complete with solar panels, pumps, electronics, and an electrodialysis stack with all the electrodes and membranes, was compact enough to fit in a trailer. They took this trailer to the Brackish Groundwater National Research Facility in Alamogordo, New Mexico, and ran it for six months. On average, it desalinated around 5,000 liters of water per day—enough for a community of roughly 2,000 people.

“The nice thing with our technology is it is more of a control method. The concept can be scaled anywhere from this small community treatment system all the way to large-scale plants,” Bessette says. He said his team is now busy building an equivalent of a single water treatment train, a complete water desalination unit designed for big municipal water supplies. “Multiple such [systems] are implemented in such plants to increase the scale of water desalination process,” Bessette says. But he also thinks about small-scale solutions that can be fitted on a pickup truck and deployed rapidly in crisis scenarios like natural disasters.

“We’re also working on building a company. Me, two other staff engineers, and our professor. We’re really hoping to bring this technology to market and see that it reaches a lot of people. Our aim is to provide clean drinking water to folks in remote regions around the world,” Bessette says.

Nature Water, 2024.  DOI: 10.1038/s44221-024-00314-6


Jacek Krywko is a freelance science and technology writer who covers space exploration, artificial intelligence research, computer science, and all sorts of engineering wizardry.


tesla-fsd-crashes-in-fog,-sun-glare—feds-open-new-safety-investigation

Tesla FSD crashes in fog, sun glare—Feds open new safety investigation

Today, federal safety investigators opened a new investigation aimed at Tesla’s electric vehicles. This is now the 14th investigation by the National Highway Traffic Safety Administration and one of several currently open. This time, it’s the automaker’s highly controversial “full self-driving” feature that’s in the crosshairs—NHTSA says it now has four reports of Teslas using FSD and then crashing after the camera-only system encountered fog, sun glare, or airborne dust.

Of the four crashes that sparked this investigation, one caused the death of a pedestrian when a Model Y crashed into them in Rimrock, Arizona, in November 2023.

NHTSA has a standing general order that requires it to be told if a car crashes while operating under partial or full automation. Fully automated or autonomous cars are those that might be termed “actually self-driving,” such as the Waymos and Zooxes that clutter up the streets of San Francisco. Festooned with dozens of exterior sensors, these four-wheel testbeds drive around—mostly empty of passengers—gathering data to train themselves with later, with no human supervision. (This is also known as SAE level 4 automation.)

But the systems that come in cars that you or I could buy are far less sophisticated. Sometimes called “level 2+,” these systems (which include Tesla Autopilot, Tesla FSD, GM’s Super Cruise, BMW Highway Assistant, and Ford BlueCruise, among others) are partially automated, not autonomous. They will steer, accelerate, and brake for the driver, and they may even change lanes without explicit instruction, but the human behind the wheel is always meant to be in charge, even if the car is operating in a hands-free mode.


adobe-shows-off-3d-rotation-tool-for-flat-drawings

Adobe shows off 3D rotation tool for flat drawings

“That’s wizardry”

The on-stage demo showed off rotations for a number of varied images, from largely symmetrical dragons, horses, and bats to more complex shapes like a sketch of a bread basket or a living cup of fries (complete with arms, legs, eyes, and a mouth). In each case, the machine-learning algorithm does an admirable job assuming unseen parts of the model from what’s available in the original 2D view, extrapolating a full set of legs on a side-view horse or the bottom of the Fry Man’s shoes, for instance.


Vertical rotation lets you see the bottom of Fry Man’s shoes here. Credit: Adobe

Still, we’re sure the vector models on stage were chosen to show Project Turntable in its best light. Without a public testable version, it’s hard to say how it would handle weird edge cases or drawings that don’t closely match objects in its training data (which we don’t know the extent of).

Even so, what was shown on stage has some obvious appeal for working artists. After seeing the on-stage video, Ars Creative Director Aurich Lawson exclaimed on our internal Slack, “That’s wizardry. I don’t know how well it really works—I bet not nearly as good as that demo a lot of the time—but I’m impressed.”

Project Turntable is also notable because it augments original work by human artists rather than replacing it with images created whole cloth by AI. While Project Turntable saves those artists the effort of drawing their 2D objects and characters from multiple angles, that human artist is still responsible for the overall style and look of that original work. Maintaining that human style seems to be a key point for Adobe, which points out that “even after the rotation, the vector graphics stay true to the original shape so you don’t lose any of the design’s essence.”

Adobe’s Brian Domingo told the Creative Bloq blog there’s still no guarantee that Project Turntable will ever be released commercially. Given the obvious enthusiasm of the demo crowd at the MAX conference, though, we think it’s safe to assume that Adobe will do whatever it can to get this feature ready for prime time as soon as possible.
