Author name: Rejus Almole


NASA nominee asks why lunar return has taken so long, and why it costs so much

WASHINGTON, DC—Over the course of a nearly three-hour committee hearing Wednesday, the nominee to lead NASA for the Trump administration faced difficult questions from US senators who sought commitments to specific projects.

However, maneuvering like a pilot with more than 7,000 hours in jets and ex-military aircraft, entrepreneur and private astronaut Jared Isaacman dodged most of their questions and would not be pinned down. His basic message to members of the Senate Committee on Commerce, Science, and Transportation was that NASA is an exceptional agency that does the impossible, but that it also faces some challenges. NASA, he said, receives an “extraordinary” budget, and he vowed to put taxpayer dollars to efficient use in exploring the universe and retaining the nation’s lead over its geopolitical competitors in space.

“I have lived the American dream, and I owe this nation a great debt,” said Isaacman, who founded his first business at 16 in his parents’ basement and would go on to found an online payments company, Shift4, that would make him a billionaire. Isaacman is also an avid pilot who self-funded and led two private missions to orbit on Crew Dragon. Leading NASA would be “the privilege of a lifetime,” he said.

The hearing took place in the Russell Senate Office Building next to the US Capitol on Wednesday morning, in an expansive room with marbled columns and three large chandeliers. There was plenty of spaceflight royalty on hand, including the four astronauts who will fly on the Artemis II mission, as well as the six private citizens who flew with Isaacman on his two Dragon missions.

“This may be the most badass assemblage we’ve had at a Senate hearing,” said US Sen. Ted Cruz, R-Texas, chair of the committee, commenting on the astronauts in the room.

Committed to staying at the Moon?

However, when the meeting got down to brass tacks, there were sharp questions for Isaacman.

Cruz opened the hearing by stating his priorities for NASA clearly and explicitly: He is most focused on ensuring the United States does not cede any of its preeminence to China in space, and this starts with low-Earth orbit and the Moon.

“Make no mistake, the Chinese Communist Party has been explicit in its desire to dominate space, putting a fully functional space station in low-Earth orbit and robotic rovers on the far side of the Moon,” he said. “We are not headed for the next space race; it is already here.”

Cruz wanted Isaacman to commit to not just flying human missions to the Moon, but also to a sustained presence on the surface or in cislunar space.

In response, Isaacman said he would see that NASA returns humans to the Moon as quickly as possible, beating China in the process. This includes flying Artemis II around the Moon in 2026, and then landing the Artemis III mission later this decade. 

The disagreement came over what to do after this. Isaacman, echoing the Trump administration, said the agency should also press onward, sending humans to Mars as soon as possible. Cruz, however, wanted Isaacman to say NASA would establish a sustained presence at the Moon. The committee has written authorizing legislation to mandate this, Cruz reminded Isaacman.

“If that’s the law, then I am committed to it,” Isaacman said.

NASA astronauts Reid Wiseman, left, Victor Glover, Christina Koch, and CSA (Canadian Space Agency) astronaut Jeremy Hansen watch as Jared Isaacman testifies on Wednesday.

Credit: NASA/Bill Ingalls


Cruz also sought Isaacman’s commitment to flying the International Space Station through at least 2030, which is the space agency’s current date for retiring the orbital laboratory. Isaacman said that seemed reasonable and added that NASA should squeeze every possible bit of research out of it until then. However, when Cruz pressed Isaacman about the Lunar Gateway, a space station NASA is developing to fly in an elliptical orbit around the Moon, Isaacman would not be drawn in. He replied that he would work with Congress and space agency officials to determine which programs are working and which ones are not.

The Gateway is a program championed by Cruz since it is managed by Johnson Space Center in Texas. Parochial interests aside, a lot of space community stakeholders question the value of the Gateway to NASA’s exploration plans.

Ten centers and the future of SLS

One of the most tense interactions came between Isaacman and Sen. Maria Cantwell, D-Wash., who wanted commitments from Isaacman that he would not close any of NASA’s 10 field centers, and also that the space agency would fly the Artemis II and Artemis III missions on the Space Launch System rocket. 

Regarding field centers, there has been discussion about making the space agency more efficient by closing some of them. This is a politically sensitive topic, and naturally, politicians from states where those centers are located are protective of them. At the same time, there is a general recognition that it would be more cost-effective for NASA to consolidate its operations as part of modernization.

Isaacman did not answer Cantwell’s question about field centers directly. Rather, he said he had not been fully briefed on the administration’s plans for NASA’s structure. “Senator, there’s only so much I can be briefed on in advance of a hearing,” he said. In response to further prodding, Isaacman said, “I fully expect to roll up my sleeves” when it came to ideas to restructure NASA.

Cantwell and other senators pressed Isaacman on plans to use NASA’s Space Launch System rocket as part of the overall plan to get astronauts to the lunar surface. Isaacman sounded as if he were on board with flying Artemis II as envisioned—no surprise, then, that this crew was in the audience—and said he wanted to get a crew of Artemis III to the lunar surface as quickly as possible. But he questioned why it has taken NASA so long, and at such great expense, to get its deep space human exploration plans moving.

He noted, correctly, that presidential administrations dating back to 1989 have been releasing plans for sending humans to the Moon or Mars, and that significantly more than $100 billion has been spent on various projects over nearly four decades. For all of that, Isaacman and his private Polaris Dawn crewmates remain the humans to have flown the farthest from Earth since the Apollo Program. They did so last year.

“Why is it taking us so long, and why is it costing us so much to go to the Moon?” he asked.

In one notable exchange, Isaacman said NASA’s current architecture for the Artemis lunar plans, based on the SLS rocket and Orion spacecraft, is probably not the ideal “long-term” solution to NASA’s deep space transportation plans. The smart reading of this is that Isaacman may be willing to fly the Artemis II and Artemis III missions as conceived, given that much of the hardware is already built. But everything that comes after this, including SLS rocket upgrades and the Lunar Gateway, could be on the chopping block. Ars wrote more about why this is a reasonable path forward last September.

Untangling a relationship with SpaceX

Some of the most intelligent questions came from US Sen. Andy Kim, D-New Jersey. During his time allotment, Kim also pressed Isaacman on the question of a sustained presence on the Moon. Isaacman responded that it was critical for NASA to get astronauts on the Moon, along with robotic missions, to determine the “economic, scientific, and national security value” of the Moon. With this information, he said, NASA will be better positioned to determine whether and why it should have an enduring presence on the Moon.

Kim then asked what, by the same logic, the economic, scientific, and national security value of sending humans to Mars would be. Not responding directly to this question, Isaacman reiterated that NASA should do both Moon and Mars exploration in parallel. NASA will need to become much more efficient to afford that, and some senators appeared skeptical. But Isaacman seems to truly believe this and wants to take a stab at making NASA more cost-effective and “mission focused.”

Throughout the hearing, Isaacman appeared to win the approval of various senators with his repeated remarks that he was committed to NASA’s science programs and that he was eager to help NASA uphold its reputation for making the impossible possible. He also said it is a “fundamental” obligation of the space agency to inspire the next generation of scientists.

A challenging moment came during questioning from Sen. Edward Markey, D-Mass., who expressed his concern about Isaacman’s relationship to SpaceX founder Elon Musk. Isaacman was previously an investor in SpaceX and has paid for two Dragon missions. In a letter written in March, Isaacman explained how he would disentangle his “actual and apparent” conflicts of interest with SpaceX.

However, Markey wanted to know if Isaacman would be pulling levers at NASA for Musk, and for the financial benefit of SpaceX. Markey pressed multiple times on whether Musk was in the room at Mar-a-Lago late last year when Trump offered Isaacman the position of NASA administrator. Isaacman declined to say, reiterating multiple times that his meeting was with Trump, not anyone else. Asked if he had discussed his plans for NASA with Musk, Isaacman said, “I have not.”

Earlier in the hearing, Isaacman sought to make clear that he was not beholden to Musk in any way.

“My loyalty is to this nation, the space agency, and its world-changing mission,” Isaacman said. Yes, he acknowledged he would talk to contractors for the space agency. It is important to draw on a broad range of perspectives, Isaacman said. But he wanted to make this clear: NASA works for the nation, and the contractors, he added, “work for us.”

A full committee vote on Isaacman is expected later this month after April 15, and if successful, the nomination would pass to the full Senate. Isaacman could be confirmed late this month or in May.



Take It Down Act nears passage; critics warn Trump could use it against enemies


Anti-deepfake bill raises concerns about censorship and breaking encryption.

The helicopter with outgoing US President Joe Biden and first lady Dr. Jill Biden departs from the East Front of the United States Capitol after the inauguration of Donald Trump on January 20, 2025 in Washington, DC. Credit: Getty Images

An anti-deepfake bill is on the verge of becoming US law despite concerns from civil liberties groups that it could be used by President Trump and others to censor speech that has nothing to do with the intent of the bill.

The bill is called the Tools to Address Known Exploitation by Immobilizing Technological Deepfakes On Websites and Networks Act, or Take It Down Act. The Senate version co-sponsored by Ted Cruz (R-Texas) and Amy Klobuchar (D-Minn.) was approved in the Senate by unanimous consent in February and is nearing passage in the House. The House Committee on Energy and Commerce approved the bill in a 49-1 vote yesterday, sending it to the House floor.

The bill pertains to “nonconsensual intimate visual depictions,” including both authentic photos shared without consent and forgeries produced by artificial intelligence or other technological means. Publishing intimate images of adults without consent could be punished by a fine and up to two years in prison. Publishing intimate images of minors could be punished with a fine or up to three years in prison.

Online platforms would have 48 hours to remove such images after “receiving a valid removal request from an identifiable individual (or an authorized person acting on behalf of such individual).”

“No man, woman, or child should be subjected to the spread of explicit AI images meant to target and harass innocent victims,” House Commerce Committee Chairman Brett Guthrie (R-Ky.) said in a press release. Guthrie’s press release included quotes supporting the bill from first lady Melania Trump, two teen girls who were victimized with deepfake nudes, and the mother of a boy whose death led to an investigation into a possible sextortion scheme.

Free speech concerns

The Electronic Frontier Foundation has been speaking out against the bill, saying “it could be easily manipulated to take down lawful content that powerful people simply don’t like.” The EFF pointed to Trump’s comments in an address to a joint session of Congress last month, in which he suggested he would use the bill for his own ends.

“Once it passes the House, I look forward to signing that bill into law. And I’m going to use that bill for myself too if you don’t mind, because nobody gets treated worse than I do online, nobody,” Trump said, drawing laughs from the crowd at Congress.

The EFF said, “Congress should believe Trump when he says he would use the Take It Down Act simply because he’s ‘treated badly,’ despite the fact that this is not the intention of the bill. There is nothing in the law, as written, to stop anyone—especially those with significant resources—from misusing the notice-and-takedown system to remove speech that criticizes them or that they disagree with.”

Free speech concerns were raised in a February letter to lawmakers sent by the Center for Democracy & Technology, the Authors Guild, Demand Progress Action, the EFF, Fight for the Future, the Freedom of the Press Foundation, New America’s Open Technology Institute, Public Knowledge, and TechFreedom.

The bill’s notice and takedown system “would result in the removal of not just nonconsensual intimate imagery but also speech that is neither illegal nor actually NDII [nonconsensual distribution of intimate imagery]… While the criminal provisions of the bill include appropriate exceptions for consensual commercial pornography and matters of public concern, those exceptions are not included in the bill’s takedown system,” the letter said.

The letter also said the bill could incentivize online platforms to use “content filtering that would break encryption.” The bill “excludes email and other services that do not primarily consist of user-generated content from the NTD [notice and takedown] system,” but “direct messaging services, cloud storage systems, and other similar services for private communication and storage, however, could be required to comply with the NTD,” the letter said.

The bill “contains serious threats to private messaging and free speech online—including requirements that would force companies to abandon end-to-end encryption so they can read and moderate your DMs,” Public Knowledge said today.

Democratic amendments voted down

Rep. Yvette Clarke (D-N.Y.) cast the only vote against the bill in yesterday’s House Commerce Committee hearing. But there were also several party-line votes against amendments submitted by Democrats.

Democrats raised concerns both about the bill not being enforced strictly enough and that bad actors could abuse the takedown process. The first concern is related to Trump firing both Democratic members of the Federal Trade Commission.

Rep. Kim Schrier (D-Wash.) called the Take It Down Act an “excellent law” but said, “right now it’s feeling like empty words because my Republican colleagues just stood by while the administration fired FTC commissioners, the exact people who enforce this law… it feels almost like my Republican colleagues are just giving a wink and a nod to the predators out there who are waiting to exploit kids and other innocent victims.”

Rep. Darren Soto (D-Fla.) offered an amendment to delay the bill’s effective date until the Democratic commissioners are restored to their positions. Ranking Member Frank Pallone, Jr. (D-N.J.) said that with a shorthanded FTC, “there’s going to be no enforcement of the Take It Down Act. There will be no enforcement of anything related to kids’ privacy.”

Rep. John James (R-Mich.) called the amendment a “thinly veiled delay tactic” and “nothing less than an attempt to derail this very important bill.” The amendment was defeated in a 28-22 vote.

Democrats support bill despite losing amendment votes

Rep. Debbie Dingell (D-Mich.) said she strongly supports the bill but offered an amendment that she said would tighten up the text and close loopholes. She said her amendment “ensures constitutionally protected speech is preserved by incorporating essential provisions for consensual content and matters of public concern. My goal is to protect survivors of abuse, not suppress lawful expression or shield misconduct from public accountability.”

Dingell’s amendment was also defeated in a 28-22 vote.

Pallone pitched an amendment that he said would “prevent bad actors from falsely claiming to be authorized from making takedown requests on behalf of someone else.” He called it a “common sense guardrail to protect against weaponization of this bill to take down images that are published with the consent of the subject matter of the images.” The amendment was rejected in a voice vote.

The bill was backed by RAINN (Rape, Abuse & Incest National Network), which praised the committee vote in a statement yesterday. “We’ve worked with fierce determination for the past year to bring Take It Down forward because we know—and survivors know—that AI-assisted sexual abuse is sexual abuse and real harm is being done; real pain is caused,” said Stefan Turkheimer, RAINN’s VP of public policy.

Cruz touted support for the bill from over 120 organizations and companies. The list includes groups like NCMEC (National Center for Missing & Exploited Children) and the National Center on Sexual Exploitation (NCOSE), along with various types of advocacy groups and tech companies Microsoft, Google, Meta, IBM, Amazon, and X Corp.

“As bad actors continue to exploit new technologies like generative artificial intelligence, the Take It Down Act is crucial for ending the spread of exploitative sexual material online, holding Big Tech accountable, and empowering victims of revenge and deepfake pornography,” Cruz said yesterday.


Jon is a Senior IT Reporter for Ars Technica. He covers the telecom industry, Federal Communications Commission rulemakings, broadband consumer affairs, court cases, and government regulation of the tech industry.



Trump boosts China tariffs to 125%, pauses tariff hikes on other countries

On Wednesday, Donald Trump once again took to Truth Social to abruptly shift US trade policy, announcing a 90-day pause that “substantially” lowers reciprocal tariffs against all countries except China to 10 percent.

Because China retaliated—raising tariffs on US imports to 84 percent on Wednesday—Trump increased tariffs on China imports to 125 percent “effective immediately.” That likely will not be received well by China, which advised the Trump administration to cancel all China tariffs Wednesday, NPR reported.

“The US’s practice of escalating tariffs on China is a mistake on top of a mistake,” the Chinese finance ministry said, calling for Trump to “properly resolve differences with China through equal dialogue on the basis of mutual respect.”

For tech companies, trying to keep up with Trump’s social media posts regarding tariffs has been a struggle, as markets react within minutes. It’s not always clear what Trump’s posts mean or how the math will add up, but after Treasury Secretary Scott Bessent clarified Trump’s recent post, the stock market surged, CNBC reported, after slumping for days.

But even though the stock market may be, for now, recovering, tech companies remain stuck swimming in uncertainty. Ed Brzytwa, vice president of international trade for the Consumer Technology Association (CTA)—which represents the $505 billion US consumer technology industry—told Ars that for many CTA members, including small businesses and startups, “the damage has been done.”

“Our small business and startup members were uniquely exposed to these reciprocal tariffs and the whipsaw effect,” Brzytwa told Ars. “There’s collateral damage to that.”

In a statement, CTA CEO Gary Shapiro suggested that the pause was “a victory for American consumers,” but ultimately the CTA wants Trump to “fully revoke” the tariffs.

“While this is great news, we are hearing directly from our members that the ongoing additional 10 percent universal baseline tariffs and this continued uncertainty are already hurting American small businesses,” Shapiro said. “CTA urges President Trump to focus his efforts on what he does best, dealmaking. Now is the time to reposition the United States with our allies as a reliable trading partner while growing the American and global economy.”



OpenAI helps spammers plaster 80,000 sites with messages that bypassed filters

“AkiraBot’s use of LLM-generated spam message content demonstrates the emerging challenges that AI poses to defending websites against spam attacks,” SentinelLabs researchers Alex Delamotte and Jim Walter wrote. “The easiest indicators to block are the rotating set of domains used to sell the Akira and ServiceWrap SEO offerings, as there is no longer a consistent approach in the spam message contents as there were with previous campaigns selling the services of these firms.”

AkiraBot worked by assigning the following role to OpenAI’s chat API using the model gpt-4o-mini: “You are a helpful assistant that generates marketing messages.” A prompt instructed the LLM to replace the variables with the site name provided at runtime. As a result, the body of each message addressed the recipient website by name and included a brief description of the service it provided.
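For readers unfamiliar with the API involved, here is a minimal sketch of the pattern SentinelLabs describes, using OpenAI’s standard Python client. The system-role text is the one quoted in the report; the user-prompt template is illustrative, not AkiraBot’s actual one.

```python
# Minimal sketch of the API pattern SentinelLabs describes, using the
# standard OpenAI Python client. The user prompt is illustrative; the
# actual AkiraBot template is not reproduced here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_marketing_message(site_name: str, site_description: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            # The system role SentinelLabs quoted from AkiraBot:
            {"role": "system",
             "content": "You are a helpful assistant that generates marketing messages."},
            # Hypothetical template: variables filled in at runtime so each
            # message names the target site, making it look hand-written.
            {"role": "user",
             "content": f"Write a short marketing message for {site_name}, "
                        f"a site that offers {site_description}."},
        ],
    )
    return response.choices[0].message.content
```

Because the model fills in site-specific details each time, no two messages share a template that a spam filter could match against.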

An AI Chat prompt used by AkiraBot Credit: SentinelLabs

“The resulting message includes a brief description of the targeted website, making the message seem curated,” the researchers wrote. “The benefit of generating each message using an LLM is that the message content is unique and filtering against spam becomes more difficult compared to using a consistent message template which can trivially be filtered.”

SentinelLabs obtained log files AkiraBot left on a server to measure success and failure rates. One file showed that unique messages had been successfully delivered to more than 80,000 websites from September 2024 to January of this year. By comparison, messages targeting roughly 11,000 domains failed. OpenAI thanked the researchers and reiterated that such use of its chatbots runs afoul of its terms of service.

Story updated to modify headline.



Llama Does Not Look Good 4 Anything

Llama Scout (17B active parameters, 16 experts, 109B total) and Llama Maverick (17B active parameters, 128 experts, 400B total), released on Saturday, look deeply disappointing. They are disappointing on the level of ‘people think they have to be misconfigured to be this bad,’ and people wondering and debating how aggressively the benchmarks were gamed.

This was by far the most negative reaction I have seen to a model release, the opposite of the reaction to Gemini 2.5 Pro. I have seen similarly deeply disappointing and misleading releases, but they were non-American models from labs whose benchmarks and claims we have learned not to take as representing model capabilities.

After this release, I am placing Meta in that category of AI labs whose pronouncements about model capabilities are not to be trusted, that cannot be relied upon to follow industry norms, and which are clearly not on the frontier. Until they show otherwise, they clearly do not belong in the category that includes OpenAI, Anthropic, Google, xAI and DeepSeek.

Techikansh: I am just gonna leave this here…

  1. Llama We Doing This Again.

  2. Llama the License Favors Bad Actors.

  3. Llama You Do It This Way.

  4. Llama Fight in the Arena.

  5. Llama Would You Cheat on Other Benchmarks.

  6. Llama It So Bad on Independent Benchmarks.

  7. Llama You Don’t Like It.

  8. Llama Should We Care.

Meta released the first two Llama 4 models last Saturday, and there is a code change indicating that the original plan was to do it Monday and it got moved up. In general, releasing on a Saturday is such bad strategy it simply isn’t done. Zuck says ‘that’s when it was ready’ but that is not an explanation.

People are wondering why Meta made an exception and did it anyway. I have two hypotheses for what happened (note: I do not have any private information here).

  1. They moved it up because the tariffs were about to potentially cause a Black Monday stock market crash, and Meta wanted to get ahead of that to protect themselves and also to not have the release buried under other news. This seems entirely reasonable under the circumstances.

  2. They released on Saturday to bury it, because it isn’t any good.

Those two look to be at cross-purposes, but I’m not so sure. Suppose, for the sake of argument here, that Llama-4 sucks.

  1. Investors can’t really tell the difference, especially not by Monday.

  2. Those who can tell the difference would be less likely to notice or talk about it.

Who knows. That’s all speculation.

What I do know is that the Llama 4 models released so far seem to not be any good.

You can download Llama 4 Scout and Maverick at Hugging Face or from llama.com. You can try it on the web, or within Meta’s products.

They offer a Llama license, which is rather obnoxious: it restricts large companies from using the models and requires prominent acknowledgment of Llama’s use, including putting ‘Llama’ in the title and adhering to the ‘acceptable use policy.’

Putting such requirements on otherwise open weight models gives an advantage to overseas companies and governments, especially the PRC, that can and will simply ignore such rules, while handicapping American companies.

European companies are of course handicapped even more: they literally are not given a license at all. Blame whoever you want for that part.

Lech Mazur: Large, it will be tough for enthusiasts to run them locally. The license is still quite restrictive. I can see why some might think it doesn’t qualify as open source.

Not cool. Be open, or be closed.

This may be part of a consistent pattern. We just saw this story by Allan Smith that Sarah Wynn-Williams, a former Facebook employee, will testify before Congress today that Meta executives undermined U.S. national security and briefed Chinese officials on emerging technologies like artificial intelligence. I don’t know if this is true, but ‘Meta has been cooperating with China for ordinary business reasons’ might be the explanation for a lot of its AI decisions.

If the models were good, this would potentially be a rather big deal.

In terms of techniques used, I take their announcement post to be ‘I hear you like mixture-of-expert LLMs and scaling up so I got you some scaled up MoEs to go with your scaled up MoEs.’ This includes the size in parameters and also amount of data.

I would take Meta’s outright statement of ‘newest model suite offering unrivaled speed and efficiency’ as an almost certainly false claim, as is the following quote from them. As in, they are sufficiently false as to downgrade my trust in Meta’s claims, which was never all that high.

Meta: Llama 4 Maverick, a 17 billion active parameter model with 128 experts, is the best multimodal model in its class, beating GPT-4o and Gemini 2.0 Flash across a broad range of widely reported benchmarks, while achieving comparable results to the new DeepSeek v3 on reasoning and coding—at less than half the active parameters.

That’s a bold claim. Feedback does not back this up.

The two features they do offer are support for 200 languages and, in theory, a long context window. I say in theory because it’s easy to offer long context so you can tout it, and hard to make that long context do anything useful while preserving performance. Needle in a haystack is not a good measure of practical use here. Skipping ahead to one private benchmark that does try to use that long context, Fiction.live: results there are historically bad, the worst they’ve ever seen, even at 60k tokens.

Meta offer some benchmarks, which many noted seem selectively chosen, and they also chose their own competition.

Anyone keeping up with LLM progress can see the choices here are a little suspicious.

Artificial Analysis confirms the scores, but only on the benchmarks Meta chose.

The Llama models are giant mixture of experts (MoE) models, similar to (and presumably because of and copying) DeepSeek’s v3 and r1. Scout is 17B active parameters, 16 experts, 109B total. Maverick is 17B active, 128 experts, 400B total. The unreleased Behemoth is huge, 288B active, 16 experts and 2T total parameters.

That means that while they are optimized to run fast on an H100, they can’t be run at all on a 4090 GPU or other similar consumer hardware, which negates one of the big advantages of open models. I presume I could run Scout and Maverick (quantized) on my Mac Studio, and I might well do that, but that’s a hefty ask.
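A rough sketch of the memory math, counting only the weights themselves and ignoring activations and KV cache, shows why:

```python
# Back-of-the-envelope weight memory for the Llama 4 models named above.
# MoE models must hold *all* expert weights in memory even though only
# ~17B parameters are active per token.
def weight_gib(total_params_b: float, bytes_per_param: float) -> float:
    return total_params_b * 1e9 * bytes_per_param / 2**30

for name, total_b in [("Scout", 109), ("Maverick", 400)]:
    for fmt, bytes_per in [("bf16", 2), ("int4", 0.5)]:
        print(f"{name} @ {fmt}: ~{weight_gib(total_b, bytes_per):,.0f} GiB")
# Scout @ bf16: ~203 GiB; Scout @ int4: ~51 GiB
# Maverick @ bf16: ~745 GiB; Maverick @ int4: ~186 GiB
# A 4090 has 24 GiB of VRAM; an H100 has 80 GiB.
```

Even aggressively quantized, Scout does not fit on any single consumer card, while a 192 GiB Mac Studio can just barely hold a quantized Maverick.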

Jeff Dean: Sure, but you can run it on 4 or 8 of them, no?

Jeremy Howard: Yes I can; as can you. But I’m primarily interested in what’s widely available in the community, where a single 4090 GPU machine is already a very rich investment.

Remember also that 3090s were the last consumer cards with NVLink, so 4090 and 5090 cards aren’t good at multi-GPU.

Jeff Dean: Fwiw, this exact reason is why we made the Gemma 3 open source models something that developers could easily run on a single GPU or TPU.

And if you have only one or two GPUs and you want to run the model as fast as you can, here’s an RL algorithm that can help figure out how to use those GPU(s) plus your CPU to go as fast as you can with whatever hardware you have.

Luke Metro: Apple Silicon’s using its large amount of unified memory for big on-device AI models might be the hardware coup of the decade if Apple Intelligence is able to get its sh*t together.

The strongest data point in Llama 4’s favor is the Arena ranking of 1417. That is good for second place, which is indeed impressive if it is reflective of general performance.

Alas, as we all know by now, Arena is being used as an optimization target. Was that done here? We don’t know.

Other signs, like the selective benchmarks they released, are suggestive of such a strategy, and they would be far from the only ones. Janus asks what, other than Goodharting, explains the rise in Arena ratings for new models. I think that’s definitely a lot of it, whether aimed at Arena itself or at things that aren’t actually Arena but are highly correlated with it.

What does Arena optimize for? A random internet user prefers your response to another model’s response.

What makes people prefer one response to another? We can also look at the actual responses and see for ourselves, now that Arena has released answers for review.

Morgan: i probably arrive too late but the lmsys voter’s preference for sycophantic yapping is particularly clear this time

Wh: These examples are extremely damning on the utility of Chatbot arena as a serious benchmark. Look through all the examples that Maverick won, and it’s slop after slop after slop. This is the nonsense you are optimizing for if you are trying to goodhart lmsys. Let’s be serious.

This is the clearest evidence that no one should take these rankings seriously.

In this example it’s super yappy and factually inaccurate, and yet the user voted for Llama 4. The rest aren’t any better.

Always start by profusely telling the user how smart they are.

TDM: Struggling to find a single answer in this that is less than 100 lines and doesn’t make me throw up.

AKR: Llama 4 Maverick Experimental vs Claude 3.7 Sonnet

Prompt: Create a web page that shows the current month as a table, with no border lines, and has buttons to move to the previous and next month. It also has the ability to show a bar that can go horizontally across the days in a week to indicate a daily streak.

3.7 Sonnet won easily because of the “Add Streak for Current Week” button, which is clearly what’s needed per the prompt. It also has better UI imo.

But on the LMArena Experimental Battles UI, the user selected the Llama 4 Mav Exp as the better model 🤦‍♂️

Goes to show that you should never believe these benchmarks unless you really try it out yourself.

Hasan Can: When I said [a well-known AI company is clearly manipulating Arena via watermarking] back on March 28th, nobody offered support. Now, time has come to put a final nail in lmarena’s coffin.

These answers by Maverick, that users voted for, seem absurdly obnoxious and bad. I originally wrote ‘these make me want to puke,’ erased it, but now that I see TDM saying the same thing I’m putting that observation back in. This is the opposite of what I want.

And indeed, this also potentially explains Claude Sonnet 3.7’s low Arena ranking. What if people really do prefer sycophancy and lengthy slop? It exists for a reason.

It’s clear Llama-4 fell victim to Goodhart’s Law, either to Arena rankings directly or to a similar other ranking process they used in fine tuning.

We also know that this version of Maverick on Arena is not the same as the one they released, and it seems, shall we say, ‘slopified.’

The question is, is that all that happened? Did they also outright cheat to get this Arena ranking? I opened a Manifold market, unfortunately we likely never know for sure but I figured something was better than nothing here, suggestions for better resolution methods welcome. When I say ‘cheating’ I mean something beyond ‘a version optimized to do well on Arena.’ I mean actual outright cheating.

Did they flat out cheat?

Peter Wildeford: According to The Information, delays were due to the model underperforming on technical benchmarks. In my opinion, it still seems like Meta was pretty selective about the metrics they chose to use (and the metrics they didn’t) and how they did the comparisons, suggesting the model may not be that good.

Satya Benson: The interesting story here is the allegations of cheating on the benchmarks. I’d love to get a better sense of to what extent this really happened and how bad the cheating is relative to other models.

First Worldist: My understanding is they tested “experimental” models without disclosing these models were trained specifically for the benchmarks

There’s at least one claim that they did outright cheat; obviously take it with tons of salt given the sourcing.

I wouldn’t think Meta would go this far, for the same reasons as Peter, so I doubt it happened. Nor would they have had to go this far. You actually have to work hard to not accidentally de facto train on benchmarks when using 22T+ tokens.

So while I’m quoting the post for posterity, I assume this accusation is probably false.

Peter Wildeford: I don’t believe the conspiracy theories about training on the test set, but I do think they’ve been highly selective in which metrics they picked in order to pretend to be better than they are.

The fact that the Chatbot Arena is a different bot than the ones getting the math scores is also telling.

Leo: It’s a pretty big no-no in ML, and seems unlikely that Meta researchers would torch their reputation risking something like this. Would need strong evidence to be convinced otherwise.

Peter Wildeford: Agreed. Accusation seems unlikely on priors and the evidence isn’t sufficient to move me enough.

Rrryougi (I doubt the claims here are true, but they seem too important not to include in the record): The original post is in Chinese and can be found here. Please take the following with a grain of salt.

Content:

Despite repeated training efforts, the internal model’s performance still falls short of open-source SOTA benchmarks, lagging significantly behind. Company leadership suggested blending test sets from various benchmarks during the post-training process, aiming to meet the targets across various metrics and produce a “presentable” result. Failure to achieve this goal by the end-of-April deadline would lead to dire consequences. Following yesterday’s release of Llama 4, many users on X and Reddit have already reported extremely poor real-world test results.

As someone currently in academia, I find this approach utterly unacceptable. Consequently, I have submitted my resignation and explicitly requested that my name be excluded from the technical report of Llama 4. Notably, the VP of AI at Meta also resigned for similar reasons.

Ortegaalfredo: “Meta’s head of AI research announces departure – Published Tue, Apr 1 2025”

At least that part is true. Ouch.

There is however this:

Hasan Can: This [below] might potentially constitute first solid evidence suggesting Llama 4 was actually trained on benchmarks.

Kaixuan Huang: Just tested Llama4-Scout on our MATH-Perturb benchmark. There is a surprising 18% gap between Original and MATH-P-Simple, making it unique among the 20+ models that came out after 2024. 😂😂

It doesn’t look great. Here it is in an easier-to-read form:

That sure looks like cheating. Again, it doesn’t mean they intentionally train on the test set. If you have 22T+ tokens and throw the entire internet at your model, there’s going to be contamination. All you have to do is not sufficiently care about not training on benchmarks. Alternatively, you can hill climb on your test scores.

Previously, I would have doubted Meta would let this happen. Now, I have less doubt.

This would not be the first time Meta has broken similar norms.

Holly Elmore: I don’t want to speak out of turn but it doesn’t seem out of character for Meta to me. They knowingly stole libgen and downloaded it via Tor bc they knew it would look bad. The informal ethics of ML are unfortunately not the reassurance I was hoping for.

Those sources seem rather illegal. Meta don’t care. What are you going to do about it?

It is 2025. In general, ‘[X] would go against norms’ is no longer seen as so strong an argument against doing [X]. The question is now, if I do [X], yes it is against norms, but even if you figure out that I did that, what are you going to do about it?

That goes double for ‘not doing enough to prevent [X] would go against norms.’

This is everything I could find that plausibly counts as a benchmark. There are some benchmarks where Maverick is mid, others where it is less than mid.

I don’t know if ARC-AGI counts as ‘independent benchmarks’ but Maverick scored 4.38% and Scout 0.5% on ARC-AGI-1 and both got 0.00% on ARC-AGI-2.

On Livebench, Llama 4 Maverick does relatively okay with a 54.38, right behind DeepSeek R1 Distill Llama 70B and Gemini 2.0 Flash.

Here are the Lech Mazur benchmarks.

Extended Word Connections (which is de facto a reasoning benchmark):

Confabulations, it gets a 22.6 here, which is rather not good:

On Creative Writing Llama Maverick bombs super hard, Llama are the three bars on the left:

In the Elimination game, things again don’t go great.

It also does not do well in Thematic Generation or Step-Game Battles where even Llama 3.3 70B kicks its ass, as does almost everything else.

BigCodeBench didn’t go great, although Llama-4-Maverick did marginally beat out Gemma-3-27B.

Markus Zimmerman reports results for DevQualityEval v1.0, and they ‘do not look good,’ they are more than halfway down a very long chart of only open models.

Harvard Ihle is here with WeirdML, Maverick is in the middle, doing pretty well relative to other benchmarks.

In general, if you have your own benchmark, it doesn’t look good:

George: the most complimentary informed takes have come from shrek, eg.

the most damning critical takes (imo) have come from curators of lesser known benchmarks, on which the new models are not performing well. The EQBench site has a couple (/they bombed), bigcodebench had Maverick coming in well below DSv2 (not a typo). Aider Polyglot bench was similarly bleak.

And here by “most damning” I am intentionally excluding takes informed by the sloptimized version that was sent to lmsys. Meta folks are chalking some of the poor results up to implementation issues, but on at least one benchmark (long context fiction) the proprietors have tried three different implementations and netted similarly disappointing scores each time.

This was Aider polyglot:

Here’s that positive viewpoint, from xjdr, clearly in the context of open models only, essentially saying that Maverick is a specialized model, good in particular for agentic and tool-calling work:

xjdr: my detailed personal benchmarks ran overnight.

– Scout is best at summarization and function calling. exactly what you want from a cheap long ctx model. this is going to be a workhorse in coding flows and RAG applications. the single shot ICL recall is very very good.

– Maverick was built for replacing developers and doing agentic / tool calling work. it is very consistent in instruction following, very long context ICL and parallel multi tool calls. this is EXACTLY the model and capabilities i want in my coder style flows. it is not creative, i have V3 and R1 for that tho. multimodal is very good at OCR and charts and graphs outperforming both 4o and qwen 2.5 VL 72 in my typical tests. the only thing i haven’t tested is computer use but i doubt it will beat sonnet or qwen at that as both models were explicitly trained for it. The output is kind of bland (hence the constant 4o comparisons) with little personality, which is totally fine. this is a professional tool built for professional work (testing it on RP or the like will lead to terrible results). Im not sure what more you could ask for in a agent focused model.

– V3-0324 is not consistent enough with tool calling output to be useful but when it gets it right, it is always the clear and best choice. however, it excels at creativity, problem solving and multi-turn interactions. this will continue to be my non-function calling workhorse. the 131k ctx feels remarkably restrictive now tho. i am going to do some more long ctx testing on V3 cause im almost positive i can get more out of it (200k – 300k ideally), but i think this is where MLA is going to show its tradeoffs. FIM and completion are also huge V3 specific wins here and places where it not only excels but is really in a league of its own.

– R1 continues to be the smartest and most creative model available when used single shot, single turn and when prompted correctly. its the genius in the corner who cant make eye contact but if you properly specify a problem it will be solved with an incredibly high degree of confidence. Function calling (really all of the V3 features) work as expected but the formatting is a bit 1/2 baked and doubly so when you use them with tool use. however, with proper parsing and sampling effort, its a truly remarkable model.

– All of these models benefit tremendously from proper sampling and lovingly crafted matmuls and accumulations. they are all much better and smarter than what is generally available from lmsys or openrouter.

I am incredibly bullish on Behemoth and R2 and cannot wait to fold them into my daily workflow. I have never been happier about the state of open source models and since the R1 launch and when used correctly they provide a viable alternative to frontier models for the first time. I am happy to answer and specific questions but this is probably my last general post on this. i gotta get back to work …

I suppose that is possible. Perhaps it has its niche and will be good at that niche once people adapt to it and scaffold it well. But that’s definitely not how Meta is presenting Maverick or the future Behemoth.

It’s weird to call it a ‘benchmark’ but worth noting that Llama 4 Scout and Maverick did not exhibit alignment faking in a new test.

Another sort-of benchmark would be red teaming, done here by Virtue AI. Alas, their tests seem to be against mundane risks only. They find that Llama 4 is significantly less compliant with AI regulations than Claude 3.7 or GPT-4.5, ‘lagging behind peers,’ and evaluations show ‘noticeable weaknesses’ against mundane harms, despite what they call ‘Maverick’s caution dilemma’ and false refusals.

That is distinct from asking about misuse, malicious fine-tuning or other sources of potential catastrophic risk from an open weights model – as always, ‘the license says you cannot do that’ is going to get ignored here. One presumes that the main defense is that these models lack the capability to cause new trouble here, at least in the absence of Behemoth.

Or, here is what people are saying in other realms.

Yair Halberstadt: Reviews on Reddit were that it was total trash, so bad they assume it must be misconfigured or something.

I’ve had confirmation of Yair’s statement from other reliable sources.

Murat: just tried llama 4 scout on groq cloud. 512 tok/s is great

however just like all the other eval-optimized models (like claude 3.7, o3-mini etc.) it doesn’t follow instructions properly. i can’t use it as drop-in replacement for my existing prompt pipelines.

just tried llama maverick. same thing. unimpressed.

grok lacks api so sonnet 3.5 is still my main squeeze.

Medo 42: Personal toy benchmark (a coding task I give to every new model): Not good at all. Shares last place with Gemini 2.0 Pro 02-07 now.

Roughly: “The code returned an array of objects in the right shape and one of the fields of the objects had the right value most of the time”

Scaling01: Llama-4-Yapper strikes again

I can’t even run tic-tac-toe bench properly because Llama-4-400B can’t shut up and just answer with 1 number.

Llama-4-109B can for some reason.

Who was the biggest cheerleader that doesn’t work at Meta?

AI and crypto czar David Sacks: Congrats to the @AIatMeta team on the launch of their new Llama 4 open-weights models. For the U.S. to win the AI race, we have to win in open source too, and Llama 4 puts us back in the lead.

Peter Wildeford: Google is so bad at marketing that @davidsacks47 doesn’t praise Gemma 3.

Failure to mention Gemma 3 feels like strong mood affiliation, on top of the marketing issues. Google is known as a closed lab, Meta is known as open. But mainly yes, Google’s marketing is atrocious. Still, a claim that Gemma 3 put us back in the lead would have been a lot more defensible than the same claim about Llama 4.

The Llama tokenizer is a place you might fear to tread.

Kalomaze: if at any point someone on your team says

“yeah we need 10 special tokens for reasoning and 10 for vision and another 10 for image generation and 10 agent tokens and 10 post tr-“

you should have slapped them

this is what happens when that doesn’t happen

Minh Nhat Nguyen: do not go into the llama tokenizer dot json. worst mistake of my life.

tbf i think the reserved llama tokens are nice for ablation experiments, but they rly go overboard with it

Jim Fan says ‘Llama-4 doesn’t disappoint’ but his response seems entirely based on Meta’s claims and reports rather than any independent assessment of performance.

All general reports on feedback say that people are disappointed. It was so disappointing that mostly people treated it as a non-event until asked.

Mena Fleischman: I haven’t seen anything particularly complimentary. They held off on dropping Behemoth which was supposed to be the real showcase of something SOTA, and next-best Maverick in their own stats got mostly beat by Deepseek, who was already beaten on release.

Very weak showing.

Andriy Burkov: If today’s disappointing release of Llama 4 tells us something, it’s that even 30 trillion training tokens and 2 trillion parameters don’t make your non-reasoning model better than smaller reasoning models.

Model and data size scaling are over.

Along similar lines, Alexander Doria doesn’t see much point in giving 40T tokens to Llama-4 Scout, and 22T to Llama-4 Maverick.

I don’t think this means model and data size scaling are over. I think it means that if you do not know how to execute, sheer size will not save you, and probably gives you smaller marginal gains than if you executed well.

The big takeaway is that we have to downgrade expectations for Meta in AI, and also our expectations for how much we can trust Meta.

Despite vastly superior resources, Meta now seems to be trying to copy DeepSeek and coming up short. Exactly how short depends on who you ask. And Meta is, to an unknown degree, making a deliberate effort to make its models look good on benchmarks in ways that violate norms.

It is hard to count out a top tech company with tons of compute and almost endless capital. They could still turn this ship around. But they’re going to have to turn this ship around, and do it fast, if they want to be competitive.

Right now, America’s open model champion isn’t Meta. It is Google with Gemma 3, and soon it may also be OpenAI, which is planning an open reasoning model. I realize that causes some dissonance, but that’s where we are. Beware mood affiliation.




Framework “temporarily pausing” some laptop sales because of new tariffs

Framework, the designers and sellers of the modular and repairable Framework Laptop 13 and other products, announced today that it would be “temporarily pausing US sales” on some of its laptop configurations as a result of new tariffs put on Taiwanese imports by the Trump administration. The affected models will be removed from Framework’s online store for now, and there’s no word on when buyers can expect them to come back.

“We priced our laptops when tariffs on imports from Taiwan were 0 percent,” the company responded to a post asking why it was pausing sales. “At a 10 percent tariff, we would have to sell the lowest-end SKUs at a loss.”

“Other consumer goods makers have performed the same calculations and taken the same actions, though most have not been open about it,” Framework said. Nintendo also paused US preorders for its upcoming Switch 2 console last week after the tariffs were announced.

For right now, Framework’s sales pause affects at least two specific laptop configurations: the Intel Core Ultra 5 125H and AMD Ryzen 5 7640U versions of the Framework Laptop 13. As of April 1, Framework was selling pre-built versions of those laptops for $999 and $899, respectively. Without those options, the cheapest versions of those laptops start at $1,399 and $1,499.



Meta’s surprise Llama 4 drop exposes the gap between AI ambition and reality

Meta constructed the Llama 4 models using a mixture-of-experts (MoE) architecture, which is one way around the limitations of running huge AI models. Think of MoE like having a large team of specialized workers; instead of everyone working on every task, only the relevant specialists activate for a specific job.

For example, Llama 4 Maverick features a 400 billion parameter size, but only 17 billion of those parameters are active at once across one of 128 experts. Likewise, Scout features 109 billion total parameters, but only 17 billion are active at once across one of 16 experts. This design can reduce the computation needed to run the model, since smaller portions of neural network weights are active simultaneously.
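As a concrete illustration of the routing idea, here is a minimal, generic top-1 MoE layer in PyTorch. It is a simplified sketch, not Meta’s actual implementation (Llama 4 reportedly also uses a shared expert, omitted here), but it shows how only the chosen expert’s weights do work for each token.

```python
# Minimal sketch of mixture-of-experts routing as described above: a router
# scores the experts and each token is processed by only its top-1 expert.
# Generic illustration; not Meta's actual Llama 4 implementation.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim: int, hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)          # (tokens, experts)
        top_w, top_idx = weights.max(dim=-1)              # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():                                # only chosen experts run
                out[mask] = top_w[mask, None] * expert(x[mask])
        return out

layer = MoELayer(dim=64, hidden=256, num_experts=16)
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

Total parameters scale with the number of experts, but per-token compute scales only with the one expert that fires, which is the active-versus-total distinction in the figures above.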

Llama’s reality check arrives quickly

Current AI models have relatively limited short-term memory. In AI, a context window acts somewhat like that memory, determining how much information a model can process at once. AI language models like Llama typically process that information as chunks of data called tokens, which can be whole words or fragments of longer words. Large context windows allow AI models to process longer documents, larger code bases, and longer conversations.

Despite Meta’s promotion of Llama 4 Scout’s 10 million token context window, developers have so far discovered that using even a fraction of that amount has proven challenging due to memory limitations. Willison reported on his blog that third-party services providing access, like Groq and Fireworks, limited Scout’s context to just 128,000 tokens. Another provider, Together AI, offered 328,000 tokens.

Evidence suggests accessing larger contexts requires immense resources. Willison pointed to Meta’s own example notebook (“build_with_llama_4”), which states that running a 1.4 million token context needs eight high-end Nvidia H100 GPUs.
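The main driver is the KV cache, which grows linearly with context length. A back-of-the-envelope sketch, with illustrative architecture numbers rather than Scout’s published configuration:

```python
# Rough KV-cache sizing to show why million-token contexts need multiple
# GPUs. The layer/head counts are illustrative assumptions, not Scout's
# published architecture.
def kv_cache_gib(tokens: int, layers: int, kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    # 2 = one key tensor and one value tensor per layer, per token
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens / 2**30

# Hypothetical mid-size config: 48 layers, 8 KV heads of dim 128, bf16 cache.
print(f"{kv_cache_gib(1_400_000, 48, 8, 128):.0f} GiB")  # ~256 GiB
# That cache sits on top of ~200 GiB of bf16 weights, which is why the
# notebook calls for eight 80 GiB H100s (640 GiB total).
```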

Willison documented his own testing troubles. When he asked Llama 4 Scout via the OpenRouter service to summarize a long online discussion (around 20,000 tokens), the result was what he described as “complete junk output” that devolved into repetitive loops.



Google’s AI Mode search can now answer questions about images

Google started cramming AI features into search in 2024, but last month marked an escalation. With the release of AI Mode, Google previewed a future in which searching the web does not return a list of 10 blue links. Google says it’s getting positive feedback on AI Mode from users, so it’s forging ahead by adding multimodal functionality to its robotic results.

AI Mode relies on a custom version of the Gemini large language model (LLM) to produce results. Google confirms that this model now supports multimodal input, which means you can now show images to AI Mode when conducting a search.

As this change rolls out, the search bar in AI Mode will gain a new button that lets you snap a photo or upload an image. The updated Gemini model can interpret the content of images, but it gets a little help from Google Lens. Google notes that Lens can identify specific objects in the images you upload, passing that context along so AI Mode can make multiple sub-queries, known as a “fan-out technique.”

Google illustrates how this could work in the example below. The user shows AI Mode a few books, asking questions about similar titles. Lens identifies each individual title, allowing AI Mode to incorporate the specifics of the books into its response. This is key to the model’s ability to recommend similar books and to respond to the user’s follow-up question.
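Google has not published AI Mode’s internals, but the flow it describes (recognize objects, then fan out one sub-query per object) can be sketched roughly like this, with every function a hypothetical stand-in:

```python
# Hypothetical sketch of the "fan-out technique" described above. Every
# function here is a stand-in; Google has not published AI Mode's internals.
def identify_objects(image: bytes) -> list[str]:
    # Stand-in for Lens: pretend we recognized these book titles.
    return ["Book A", "Book B"]

def run_subquery(query: str) -> str:
    # Stand-in for one search/LLM sub-query.
    return f"answer to: {query}"

def synthesize_answer(question: str, sub_answers: list[str]) -> str:
    # Stand-in for the model step that merges sub-query results.
    return f"{question} -> " + "; ".join(sub_answers)

def fan_out_search(image: bytes, user_question: str) -> str:
    titles = identify_objects(image)                 # Lens's role
    subs = [run_subquery(f"{user_question} (re: {t})") for t in titles]
    return synthesize_answer(user_question, subs)    # one merged response

print(fan_out_search(b"", "recommend books similar to these"))
```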



Dustland Delivery plays like a funny, tough, post-apocalyptic Oregon Trail

Road trips with just two people always have their awkward silences. In Dustland Delivery, my character, a sharpshooter, has tried to break the ice with the blacksmith he hired a few towns back, with only intermittent success.

Remember that bodyguard, the one I unsuccessfully tried to flirt with at that bar? The blacksmith was uninterested. What about that wily junk dealer, or the creepy cemetery? Silence. She only wanted to discuss “Abandoned train” and “Abandoned factory,” even though, in this post-apocalypse, abandonment was not that rare. But I made a note to look out for any rusted remains; stress and mood are far trickier to fix than hunger and thirst.

Dustland Delivery release trailer.

Dustland Delivery, available through Steam for Windows (and Proton/Steam Deck), puts you in the role typically taken up by NPCs in other post-apocalyptic RPGs. You’re a trader, buying cheap goods in one place to sell at a profit elsewhere, and working the costs of fuel, maintenance, and raider attacks into your margins. You’re in charge of everything on your trip: how fast you drive, when to rest and set up camp, whether to approach that caravan of pickups or give them a wide berth.

Some of you, the types whose favorite part of The Oregon Trail was the trading posts, might already be sold. For the others, let me suggest that the game is stuffed full of little bits of weird humor and emergent storytelling, and a wild amount of replayability for what is currently a $5 game. There are three quest-driven scenarios, plus a tutorial, in the base game. A new DLC out this week, Sheol, adds underground cities, ruins expeditions, more terrains, and a final story quest for four more dollars.

Dustland Delivery plays like a funny, tough, post-apocalyptic Oregon Trail Read More »

switch-2-preorders-delayed-over-trump-tariff-uncertainty

Switch 2 preorders delayed over Trump tariff uncertainty

Nintendo Switch 2 preorders, which were due to begin on April 9, are being delayed indefinitely amid the financial uncertainty surrounding Donald Trump’s recent announcement of massive tariffs on most US trading partners.

“Pre-orders for Nintendo Switch 2 in the U.S. will not start April 9, 2025 in order to assess the potential impact of tariffs and evolving market conditions,” Nintendo said in a statement cited by Polygon. “Nintendo will update timing at a later date. The launch date of June 5, 2025 is unchanged.”

Nintendo announced launch details for the Switch 2 on Wednesday morning, just hours before Trump’s afternoon “Liberation Day” press conference announcing the biggest increase in import duties in modern US history. Those taxes on practically all goods imported into the United States are set to officially go into effect on April 9, the same day Nintendo had planned to roll out Switch 2 preorders for qualified customers.

Welcome to day 2 of Nintendo Treehouse Live’s “drop the price” stream

— AmericanTruckSongs10 (@ethangach.bsky.social) April 4, 2025 at 10:14 AM

The delay in the preorder date comes as outspoken gamers online are making plenty of noise over the Switch 2’s higher-than-expected $450 price point and over Switch 2 software pricing falling in the $70 to $80 range. Nintendo’s promotional “Treehouse” streams showing Switch 2 gameplay have been inundated with a nonstop torrent of chatters demanding the company “DROP THE PRICE.”

Yet today’s announcement suggests that Nintendo might need to “assess” whether even a $450 price is feasible given the additional taxes the company will now have to pay to import systems manufactured in countries like China and Vietnam into the United States. Alternatively, Nintendo could eat the cost of any tariffs and sell its console hardware at a loss, as it has in the past, in an attempt to make that money back in software sales.

Switch 2 preorders delayed over Trump tariff uncertainty Read More »

spinlaunch—yes,-the-centrifuge-rocket-company—is-making-a-hard-pivot-to-satellites

SpinLaunch—yes, the centrifuge rocket company—is making a hard pivot to satellites

Outside of several mentions in the Rocket Report newsletter dating back to 2018, Ars Technica has not devoted too much attention to covering a novel California space company named SpinLaunch.

That’s because the premise is so outlandish as to almost not feel real. The company aims to build a kinetic launch system that spins a rocket around at speeds up to 4,700 mph (7,500 km/h) before sending it upward toward space. Then, at an altitude of 40 miles (60 km) or so, the rocket would ignite its engines to achieve orbital velocity. Essentially, SpinLaunch wants to yeet things into space.
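For a sense of why the idea sounds so wild, a quick back-of-envelope check on the centripetal load is instructive. The roughly 100-meter accelerator diameter below is an assumption based on figures SpinLaunch has discussed publicly for a full-scale system, so treat the output as illustrative only.

```python
# Back-of-envelope centripetal load on a payload in a spin accelerator.
# The 100 m diameter is an assumption; numbers are illustrative only.
v = 4700 * 0.44704   # release speed: 4,700 mph in m/s (~2,100 m/s)
r = 50.0             # accelerator radius in meters (assumed)
a = v ** 2 / r       # centripetal acceleration, m/s^2
print(f"{a:,.0f} m/s^2, or about {a / 9.81:,.0f} g")  # roughly 9,000 g
```

In other words, anything launched this way would have to survive thousands of g before it ever lights an engine.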

But the company was no joke. After being founded in 2014, it raised more than $150 million over the next decade. It built a prototype accelerator in New Mexico and performed a series of flight tests. The flights reached altitudes of “tens of thousands” of feet, according to the company, and were often accompanied by slickly produced videos.

SpinLaunch goes quiet

Following this series of tests, by the end of 2022, the company went mostly quiet. It was unclear whether it ran out of funding, had hit some technical problems in trying to build a larger accelerator, or what. Somewhat ominously, SpinLaunch's founder and chief executive, Jonathan Yaney, was replaced without explanation last May. His successor was David Wrenn, who had been serving as chief operating officer.

“I am confident in our ability to execute on the company’s mission and bring our integrated tech stack of low-cost space solutions to market,” Wrenn said at the time. “I look forward to sharing more details about our near- and long-term strategy in the coming months.”

Words like "tech stack" and "low-cost space solutions" sounded like nebulous corporate speak, and it was not clear what they meant. Nor did Wrenn immediately deliver on that promise, made nearly a year ago, to share more details about the company's near- and long-term strategy.

SpinLaunch—yes, the centrifuge rocket company—is making a hard pivot to satellites Read More »

old-faces-in-unexpected-places:-the-wheel-of-time-season-3-rolls-on

Old faces in unexpected places: The Wheel of Time season 3 rolls on

Andrew Cunningham and Lee Hutchinson have spent decades of their lives with Robert Jordan and Brandon Sanderson's Wheel of Time books, and they previously brought that knowledge to bear as they recapped each episode of the first and second seasons of Amazon's WoT TV series. Now we're back in the saddle for season 3—along with insights, jokes, and the occasional wild theory.

These recaps won’t cover every element of every episode, but they will contain major spoilers for the show and the book series. We’ll do our best to not spoil major future events from the books, but there’s always the danger that something might slip out. If you want to stay completely unspoiled and haven’t read the books, these recaps aren’t for you.

New episodes of The Wheel of Time season 3 will be posted for Amazon Prime subscribers every Thursday. This write-up covers episode six, “The Shadow in the Night,” which was released on April 3.

Lee: Welcome to Tanchico! In Tanchico, everyone wears veils almost all of the time, except when they’re flirting in bars. Mat gets the most fabulous veil of all because he’s Mat and he deserves it. Even Nynaeve has a good time! And I guess now we know all about the hills of Tanchico. Like… alllllllllllllllllll about them.

Andrew: Credit to Robert Jordan for mostly resisting one of the bizarre tics of post-Tolkien fantasy fiction: I'm not going to say the books never take a break to give us the full text of an in-universe song, but they do so pretty sparingly, if memory serves. There are plenty of songs referenced, though, often with a strong implication that they are too lewd or horny to reprint in full.

Not so in the show, where Elayne sings a song about "The Hills of Tanchico," bringing the house down for what appears to be… several hours (they're breasts; the hills are breasts). I don't mind this scene, actually, but it does go on.

But more important than the song is who is accompanying Elayne, a book character who has been gone so long that we weren’t actually sure he was coming back. Who makes their long-awaited return in Tanchico, Lee?

Thom Merrilin finally shows back up. Nice hat. Wonder who else might end up wearing it. Credit: Prime/Amazon MGM Studios

Lee: That’s right, ladies and gentlemen, boys and girls, children of all ages, stomp your feet and bring your hands together for everybody’s favorite gleeman, seemingly back from the dead and rocking a strangely familiar hat: It’s Thom Merrilin! (Applause roars.)

Viewers who haven't read the books can be forgiven for not immediately falling out of their chairs when Thom shows back up, but to book readers, his absence has been keenly felt. Believe it or not, Merrilin is an A-string player in the books, spending a tremendous amount of time front and center interacting with the main POV characters. He vanishes for a stretch there too, just as he does in the show, but he doesn't stay gone nearly this long.

I’m glad he’s back, and it bodes well for our Tanchico crew—unlike them, Thom is an actual-for-real adult, who’s been places and knows things. He also provides fantastic accompaniment to Elayne’s karaoke adventure.

Elayne wins the crowd by singing about tittays. Thom accompanies because it's a subject in which he is apparently well-versed. Credit: Prime/Amazon MGM Studios

Andrew: The entire Tanchico crew is pretty strong right now—Mat and Min are pals again, show-Nynaeve is a version of the character who other characters in the story are allowed to like, and now Thom is back! It’d be a rollicking good time, if it weren’t for these sadistic Black Ajah Aes Sedai and the Forsaken psychopath Moghedien stalking around, mind-controlling people, leaving holes in heads, and trying to find a Seanchan-esque collar that can subdue and control Rand.

We’re entering a stretch of the story where the Forsaken spend as much time fighting with each other as they do with Rand and our heroes, which explains why the powerful villains don’t simply kill our heroes the minute they find each other. Moghedien is in full creep mode through this whole episode, and I gotta say, she is unsettling.

Moghedien, doing her thing. Credit: Prime/Amazon MGM Studios

Lee: Yeah, watching Moghedien screw with the Black sisters' food and stuff was particularly disturbing. The lady has no filter—and fantastic powers of persuasion. We get another clear look at just how ludicrously overpowered the Forsaken are compared to our present-day channelers when Moggy straight-up runs "sudo give me the bracelet" on Nynaeve's and Elayne's brains—much like Rahvin's I'm-your-favorite-uncle routine, her Power-backed trickery is devastating and completely inescapable (though Nynaeve apparently does resist just a teeny tiny bit).

And although there are still more doings to discuss in Tanchico—the quest to discover the bracelets-n-collars is heating up!—the fact that all of these episodes are an hour long means there are so many other things to discuss. Like, for example, the return of another familiar face, in the form of our long-absent whistling super-darkfriend Padan Fain. Dark doings are afoot in the Two Rivers!

Andrew: Fain in the books never quite rises to the level of Big Bad so much as he lurks around the periphery of the story practically the whole entire time, popping up to cause trouble whenever it’s the least convenient for our heroes. The show does a good job of visually representing how he’s begun to corrupt the regiment of Whitecloaks he has embedded himself in, without ever actually mentioning it or drawing much attention to it. You know you’re a bad guy when even Eamon Valda is like “uh is this guy ok?” (As in the books, the show distinguishes between Whitecloaks who are antagonists because they genuinely believe what they say they believe about Aes Sedai “witches,” and ones who are simply straight-up Darkfriends. Funny how often they end up working toward the same goals, though.)

Meanwhile, Perrin, Alanna, and friends recover from last week's raid of the Whitecloak camp. I keep needing to recalibrate my expectations for what Plot Armor looks like on this show, because our main characters get grievously wounded pretty regularly, but the standards are different on a show where everyone can apparently cast Cure Wounds as a cantrip. Alanna walks the Cauthon sisters through some rudimentary Healing, and (with barely disguised glee and/or interest) accidentally interrupts an escalation in Perrin and Faile's relationship when she goes to Heal him later.

Are we still finding show-Faile charming? I did think it was funny when that goofy county-fair caricature of Mat holding the Horn of Valere made another appearance.

Still not hating Faile, which feels surprising. Credit: Prime/Amazon MGM Studios

Lee: I am definitely still finding show-Faile charming, which continually surprises me because she’s possibly the worst character in the entire series. In the books, Jordan writes Faile as an emotionally abused emotional abuser who doesn’t believe Perrin loves her if he’s not screaming at her and/or hitting her; in the show, she’s a much more whole individual with much more grown-up and sane ideas about how relationships work. Perrin and Faile have something going on that is, dare I say it, actually sweet and romantic!

I never thought I’d be on any team other than Team Throw-Faile-Down-The-Well, but here we are. I’m rooting for her and Perrin.

When it comes to Alanna’s healing at the hands of the Cauthon sisters, I had to sit with that one for a moment and make a conscious decision. The books make it clear that Healing—even the abbreviated first-aid version the current-day Aes Sedai practice, to say nothing of the much fancier version from the Age of Legends—is complicated. Doing it wrong can have horrific consequences (in fact, “doing healing wrong on purpose” is the basis for many of the Yellow-turned-Black sisters’ attacks with the One Power). And these wildlings (to borrow a book term) are able to just intuit their way into making it happen?

We know that new channelers frequently have uncontrolled bouts of blasting out the One Power in response to moments of stress or great need—in fact, we’ve seen that happen many times in the show, including at the beginning of this episode when Lil’ Liandrin force-blasts her rapist-husband into the wall. So the groundwork is there for the Cauthon girls to do what they’re doing. It’s just a question of how much one is willing to let the show get away with.

I decided I’m good with it—it’s the necessary thing to move the story forward, and so I’m not gonna complain about it. Where did you land?

Fain returns, bringing with him the expected pile of Trollocs. Credit: Prime/Amazon MGM Studios

Andrew: Yeah, I made essentially the same decision. Conscious use of the One Power at all, even the ability to access it consistently, is something that requires patience and training, and normally you couldn’t talk a 12-year-old through Healing as Alanna does here any more than you could talk a 12-year-old through performing successful field surgery. But training takes time, and showing it takes time, and time is one thing the show never has much of. The show also really likes to dramatically injure characters without killing them! So here we are, speed-running some things.

This leaves us with two big threads left to address: Rand’s and Egwene’s. Egwene is still trying to learn about the World of Dreams from the Aiel Wise Ones (I was wrong, by the way—she admits to lying about being Aes Sedai here and it passes almost without comment), and is still reeling from realizing that Rand and Lanfear are Involved. And Rand, well. He’s not going mad, yet, probably, but he spends most of the episode less-than-fully-in-control of his powers and his actions.

Lee: It comes to a head when Rand and Egwene have a long, difficult conversation about exactly who's been sleeping with whom, and why—and then that conversation is interrupted when Sammael kicks the door down and starts swinging his big fancy One Power hammer.

There’s a bit of channeling by Aviendha and Egwene, but then Rand grasps the Source and Sammael just kind of stops being a factor. Entranced by the Power—and by the black corruption pulsing through it—Rand straight-up destroys Sammael without apparent thought or effort, borrowing a bit of the method from the way Rand pulls off a similar feat in book 3, with a ludicrous amount of lightning and ceiling-collapsing.

It's one of the few times so far that Rand has actually cut loose with the One Power, and I like it when we get to actually see (rather than just hear about) the sheer scale of Rand's strength as a channeler. But this casual exercise of extreme power is not without a cost.

Rand does a 360 no-scope lightning hit. Credit: Prime/Amazon MGM Studios

Andrew: We’ve observed a couple of times that Rand and Egwene in the books had long since given up on romantic involvement by this point in the story, and here we see why the show held back on that—this confrontation is more exciting than a quiet drift, and it puts a cap on several “Rand is not the simple lad you once knew” moments sprinkled throughout the episode.

And, yes, one of them is Rand’s inadvertent (if sadly predictable) killing of an Aiel girl he had forged a bond with, and his desperate, fruitless, unsavory attempt to force her back to life. Rand is simultaneously coming to grips with his destiny and with the extent to which he has no idea what he is doing, and both things are already causing pain to the people around him. And as you and I both know, book-Rand has counterproductive and harmful reactions to hurting people he cares about.

The attack here is partly an invention of the show and partly a synthesis of a few different book events, but Forsaken coming at Rand directly like this is generally not a thing that happens much. They usually prefer to take up positions of power in the world’s various kingdoms and only fight when cornered. All of this is to say, I doubt this is the last we see of Sammael or his Thor-looking One Power hammer, but the show is more than willing to go its own way when it wants to.

Lee: Yeah, Rand doing saidin-CPR on Rhuarc’s poor little too-cute-not-to-be-inevitably-killed granddaughter is disturbing as hell—and as you say, it’s terrifying not just because Rand is forcing a corpse to breathe with dark magic, but also because of the place Rand seems to go in his head when he’s doing it. It’s been an oft-repeated axiom that male channelers inevitably go mad—is this it? (Fortunately, no—not yet, at least. Or is it? No! Maybe.)

We close the episode out in the place where I think we'll probably be spending a lot of time very soon (especially based on the title of next week's episode, which I won't spoil but which anyone can look up if they wish): back at the Two Rivers, with the power-trio of Bain and Chiad and Faile scouting out the old Waygate just outside of town, and watching Trollocs swarm out of it. This is not a great sign for Perrin and friends.

So we've got two episodes left, and all of our chess pieces seem to have been set more or less in the right places for a couple of major climactic events. I think we're going out with a bang—or with a couple of them. What are you thinking as we jump into the final couple of episodes?

Alsera fell victim to one of the classic child character blunders: being too precociously adorable to live. Credit: Prime/Amazon MGM Studios

Andrew: I am going to reiterate our annual complaint that 10-episode seasons would serve this show's storytelling better than the 8-episode seasons we're getting, because the show's pace is always so breathless that it leaves room for just a few weird character-illuminating diversions like "The Hills of Tanchico," or quiet heart-to-hearts like we get between Rand and Moiraine, or between Perrin and Faile. The show's good enough at these that I wish we had time to pump the brakes more often.

But I will say, if we end up roughly where book 4 does, the show doesn't feel as rushed as the first two seasons did. Not that its pacing has settled down at all—you and I benefit immensely from being book readers, always rooted in some sense of what is happening and who the characters are, which the show can't always convey with perfect clarity. But I am thinking about what still needs to happen, and how much time there is left, and thinking "yeah, they're going to be able to get there" instead of "how the hell are they going to get there??"

How are you feeling? Is season 3 hitting for you like it is for me? I know I’m searching around every week to see if there’s been a renewal announcement for season 4 (not yet).

Lee: I think it’s the best season so far, and any doubts I had during seasons one and two are at this point long gone. I’m all in on this particular turning of the Wheel, and the show finally feels like it’s found itself. To not renew it at this point would be criminal. You listening, Bezos? May the Shadow take you if you yank the rug out from under us now!

Andrew: Yeah, Jeffrey. I know for a fact you’ve spent money on worse things than this.


Old faces in unexpected places: The Wheel of Time season 3 rolls on Read More »